(54) Title: GENE EXPRESSION SYSTEM
(57) Abstract: There are provided DNA constructs, including replicable cloning vectors and expression vectors, comprising a bacteriophage promoter operably linked to an outron sequence. The expression vectors provided by the invention are useful in the expression of recombinant polypeptides in host cells or organisms and are particularly useful in expression of recombinant polypeptides in nematode worms such as C. elegans.

GENE EXPRESSION SYSTEM

Field of the invention 

The invention relates to the expression of DNA,
genes, cDNAs, proteins, peptides and parts thereof in
the nematode worm C. elegans. In particular, the
invention relates to methods of improving the
translation of RNAs transcribed in C. elegans using a
bacteriophage polymerase by introduction of a
trans-splice recognition site recognised by an SL1
trans-splice recognition sequence into the DNA template
transcribed by the bacteriophage polymerase.

Background to the invention

Eukaryotic versus prokaryotic expression.

Bacteriophage RNA polymerases, such as T7, T3,
and SP6, and their corresponding promoters have been
used extensively to drive the expression of
heterologous genes in a variety of organisms. In
co-pending International patent application No. WO
00/01846, Plaetinck et al. describe the use of the T7
system to express DNA, genes, cDNA, proteins and
peptides or parts thereof and for the expression of
double-stranded RNA (dsRNA) in the nematode model
system C. elegans.

The bacteriophage expression systems are well
known in the art for use in prokaryotic host cells,
such as E. coli, and have the advantage that they
provide simple and strong expression systems dependent
only on one RNA polymerase and one well defined
promoter. The application of such efficient
expression systems in eukaryotic organisms is,
however, not evident, mainly because messenger RNAs
from eukaryotes and prokaryotes have a different
structure, which has implications for translation
efficiency and RNA stability.

Messenger RNAs of higher eukaryotes share a
functionally essential 5' CAP structure. This
structure is generated during a capping reaction that
is linked exclusively to RNA polymerase II
transcription. Prokaryotic RNA polymerases such as
bacteriophage T3, T7 and SP6 polymerases do not
provide messenger RNAs with such a CAP structure,
leading to inefficient translation in eukaryotic
systems (Fuerst et al., J. Mol. Biol. 206:333-348
(1989)).

One way to improve translation of uncapped mRNAs
in eukaryotic systems is by the insertion of an
internal ribosome entry site (IRES) sequence 5' of the
coding sequence. For example, Elroy-Stein et al.,
Proc. Natl. Acad. Sci. USA 87:6743-6747 (1990),
describe the cloning of the untranslated region of
encephalomyocarditis virus (EMCV) downstream of the T7
promoter in order to enhance the efficiency of
translation. In other systems translation of
T7-derived transcripts may be enhanced by addition of a
CAP structure derived from a capped transcript. For
example, in Trypanosoma a 5' CAP structure is added to
T7-generated RNA transcripts by a naturally occurring
trans-splicing reaction (Wirtz et al., NAR
22:3887-3894 (1994)).

Trans-splicing in C. elegans.

In C. elegans many mRNAs contain an identical
short leader sequence, designated the spliced leader
(SL). This splice leader is donated by a small RNA (SL
RNA) via a trans-splicing reaction. This
trans-splicing was first observed by Krause et al., Cell
49:753-61 (1987). The splice leader RNA exists as a
small nuclear ribonucleoprotein particle and has the
trimethylguanosine cap that is characteristic of
eukaryotic small nuclear RNAs. The trimethylguanosine
cap present on the spliced leader RNA is transferred
to the pre-mRNA during the trans-splicing reaction.
Thereafter, the trimethylguanosine cap is maintained
on the mature mRNA (Van Doren et al., Mol. Cell. Biol.
10:1769-1772 (1990)). The trans-splicing signal for
such a splice leader is essentially an intron missing
only the 5' splice site, designated an 'outron'. An
outron has essentially all the intron sequence
including a trans-splice acceptor site homologous to a
UUUCAG sequence preceded by an AU-rich region (Conrad
et al., NAR 21:913-919 (1993)). Introduction of an
outron into the 5' untranslated region of a C. elegans
gene converts it to a trans-spliced gene (Conrad et
al., EMBO J. 12:1249-1255 (1993); Conrad et al., Mol.
Cell. Biol. 11:1921-1926 (1991)) and introduction of
donor sites in a naturally trans-spliced C. elegans gene
prevents trans-splicing and converts it into a more
conventional gene.


Description of the invention. 

Until recently, expression of heterologous and
homologous genes in C. elegans was mainly achieved by
linking an appropriate coding sequence to a selected
C. elegans promoter. The present inventors have
recently demonstrated that recombinant gene
expression in C. elegans can be based on the
prokaryotic T7 expression system (WO 00/01846).
However, the present inventors found that the
expression system was far from being efficient, or at
least the resulting expression was much lower than
would be expected from this T7-related expression
system. It was concluded that this low expression was
mainly due to RNA instability or translation arrest.
Furthermore, it was reasoned that fundamental
differences between prokaryotic and eukaryotic
expression systems, particularly the requirement for
capping of the 5' end of the mRNA for efficient
translation in eukaryotic systems, were the main reason
for this unexpectedly low expression.
The inventors have now developed a solution to
the problem of the inefficiency of the T7 system in
eukaryotic host cells and organisms, particularly in
C. elegans, and have constructed a generally
applicable expression system which allows for the
efficient expression of genes, DNA, cDNA, peptides and
proteins under the regulation of the T7 promoter in C.
elegans.

Therefore, in accordance with a first aspect of
the invention there is provided a DNA construct
comprising a bacteriophage promoter operably linked to
an outron sequence.

It is an essential feature of the DNA construct
of the invention that the bacteriophage promoter and
the outron sequence are "operably linked", that is to
say they are arranged in a relationship permitting
them to function in their intended manner. In this
case, the bacteriophage promoter is positioned
upstream of the outron sequence such that it is
capable of promoting transcription of the outron
sequence upon binding of an appropriate RNA
polymerase, with the outron sequence forming the
extreme 5' end of the resulting transcript.
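
The positional requirement described above (promoter upstream of the outron on the same strand) can be illustrated with a minimal sketch. The function name, the toy outron and the flanking bases are hypothetical and chosen only for illustration; the T7 promoter string is the commonly cited minimal consensus and is not taken from this specification.

```python
def promoter_upstream_of_outron(construct: str, promoter: str, outron: str) -> bool:
    """Return True if `promoter` occurs in `construct` and `outron`
    occurs somewhere downstream of it on the same (given) strand."""
    construct, promoter, outron = (s.upper() for s in (construct, promoter, outron))
    p = construct.find(promoter)
    if p == -1:
        return False
    # Search for the outron only in the region after the promoter.
    return construct.find(outron, p + len(promoter)) != -1


# Assumed values for illustration only (not the sequences of Figure 1).
T7_PROMOTER = "TAATACGACTCACTATAG"        # commonly cited T7 consensus
TOY_OUTRON = "ATTTTATTAAATTTTATTTTCAG"    # hypothetical AT-rich stretch + TTTCAG

good = "GGCC" + T7_PROMOTER + "GGAGA" + TOY_OUTRON + "ATGAAATAA"
bad = "GGCC" + TOY_OUTRON + "GGAGA" + T7_PROMOTER + "ATGAAATAA"

print(promoter_upstream_of_outron(good, T7_PROMOTER, TOY_OUTRON))
print(promoter_upstream_of_outron(bad, T7_PROMOTER, TOY_OUTRON))
```

The check is purely positional; it does not model polymerase binding or transcription.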

The DNA construct may further comprise at least
one restriction enzyme recognition site positioned
downstream of and proximal to the outron sequence.
Advantageously, the DNA construct may contain multiple
restriction sites forming a multi-cloning site. The
purpose of the restriction site/multi-cloning site is
to facilitate cloning of a heterologous or homologous
DNA fragment downstream of the outron sequence. A DNA
construct comprising a bacteriophage promoter, an
outron sequence and a restriction site/multi-cloning
site may therefore be referred to hereinafter as an
'outron cloning construct'.

In an outron cloning construct it is advantageous
for the restriction site/multi-cloning site to be
positioned fairly proximal to the outron sequence
(e.g. within 100 bp) such that a heterologous or
homologous sequence inserted at this site may be
co-transcribed with the outron sequence on a single mRNA.
However, further sequence elements may be interposed
between the outron sequence and the restriction
site/multi-cloning site. For example, the general
purpose vector pDW3123 described in the accompanying
examples has a synthetic intron A sequence between the
outron sequence and the multi-cloning site.
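
As a rough illustration of the proximity guideline (a restriction site within about 100 bp downstream of the outron), the following sketch measures the gap between the end of an outron and the first downstream occurrence of a recognition site. The function name, the toy outron and the spacer are hypothetical; GAATTC is the well-known EcoRI recognition sequence and is used here only as an example site.

```python
from typing import Optional


def gap_to_site(construct: str, outron: str, site: str) -> Optional[int]:
    """Return the number of bases between the 3' end of `outron` and the
    first occurrence of `site` downstream of it, or None if either is absent."""
    c = construct.upper()
    o = c.find(outron.upper())
    if o == -1:
        return None
    outron_end = o + len(outron)
    s = c.find(site.upper(), outron_end)
    return None if s == -1 else s - outron_end


# Hypothetical toy sequences for illustration only.
TOY_OUTRON = "ATTTTATTAAATTTTATTTTCAG"
ECORI_SITE = "GAATTC"
construct = "GGGAGA" + TOY_OUTRON + "ACGTACGTACGT" + ECORI_SITE + "CCC"

gap = gap_to_site(construct, TOY_OUTRON, ECORI_SITE)
print(gap, gap is not None and gap <= 100)
```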

In one preferred embodiment of the invention, the
DNA construct is a replicable cloning vector, such as,
for example, a plasmid vector. In addition to the
bacteriophage promoter, outron sequence and optional
restriction site/multi-cloning site, the vector may
further contain one or more of the general features
commonly found in cloning vectors, for example an
origin of replication to allow autonomous replication
within a host cell and a selective marker, such as an
antibiotic resistance gene. Although not essential,
the vector may also contain a poly-adenylation signal
to stabilize and process the 3' end of the mRNA
transcribed from the bacteriophage promoter. A
preferred example is the 3' UTR from the C. elegans
unc-54 gene, but any other 3' UTR or polyadenylation
signal may be used.

Outron-containing DNA constructs according to the
invention may easily be constructed from the
component sequence elements using standard recombinant
techniques well known in the art and described, for
example, in F. M. Ausubel et al. (eds.), Current
Protocols in Molecular Biology, John Wiley & Sons,
Inc. (1994).
Outron sequences for use in the constructs of the
invention may be isolated from natural C. elegans
genes using standard molecular biology techniques.
For example, a natural outron sequence might be
amplified using the polymerase chain reaction or an
equivalent amplification technique using C.
elegans genomic DNA as a template. Alternatively,
synthetic outron sequences may be synthesised, for
example, by annealing two complementary single
stranded oligonucleotides, as illustrated in the
accompanying examples. Once a DNA fragment comprising
the outron sequence has been obtained it would be a
matter of routine to assemble an outron construct by
linking the outron in the correct orientation relative
to the bacteriophage promoter.
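
The annealing of two complementary oligonucleotides can be checked in silico before ordering them. The sketch below assumes a design in which each oligonucleotide carries a 4-nt 5' overhang (CTAG for an XbaI-compatible end, CCGG for an XmaI-compatible end) followed by a fully complementary core. The core sequence and function names are hypothetical and do not reproduce oligonucleotides o-GN59 and o-GN60.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def duplex_core(top: str, bottom: str, overhang: int = 4) -> Optional[str]:
    """If the two oligos anneal to leave `overhang`-nt 5' overhangs on each
    strand, return the double-stranded core (top-strand sense); else None."""
    core_top = top.upper()[overhang:]
    core_bottom = bottom.upper()[overhang:]
    return core_top if core_top == reverse_complement(core_bottom) else None


# Hypothetical core: an AT-rich stretch ending in a TTTCAG acceptor.
CORE = "ATTTTAAATTTTATTTTCAG"
top_oligo = "CTAG" + CORE                          # XbaI-compatible 5' overhang
bottom_oligo = "CCGG" + reverse_complement(CORE)   # XmaI-compatible 5' overhang

print(duplex_core(top_oligo, bottom_oligo) == CORE)
```

A mismatch anywhere in the core makes `duplex_core` return `None`, which is a cheap way to catch a transcription error in one of the two oligos.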

The sequences of the commonly used bacteriophage
promoters, e.g. T7, T3 and SP6, are well known in the
art and oligonucleotides containing functional phage
promoter sequences can be readily synthesised using
standard oligonucleotide synthesis techniques. It
would be a matter of routine to insert such a
synthetic promoter sequence into, for example, a
plasmid vector backbone containing, for example, an
origin of replication, a selective marker and a
suitable restriction site. Alternatively, one of the
many plasmid vectors containing bacteriophage promoter
sequences known in the art may be used as the starting
point for the construction of a plasmid-based outron
cloning vector. The known vectors generally contain,
in addition to the phage promoter sequence, one or
more restriction sites conveniently positioned
downstream of the phage promoter and also a bacterial
origin of replication and a selective marker. Once
the vector backbone is in place the outron sequence
may simply be inserted in the appropriate position
downstream of the bacteriophage promoter.

In a particularly useful embodiment the invention
provides a DNA construct for use in bacteriophage
promoter-driven expression of a polypeptide in a
eukaryotic host cell or organism. This construct
comprises a bacteriophage promoter operably linked to
a DNA sequence such that it is capable of initiating
transcription of the DNA sequence upon binding of an
appropriate RNA polymerase to the promoter, wherein
the aforesaid DNA sequence comprises an outron
sequence and at least one open reading frame
positioned downstream of the outron sequence.

The open reading frame may be essentially any
protein-encoding DNA sequence bounded by start and
stop codons. This protein-encoding DNA sequence may
include introns, as both trans-splicing and
cis-splicing can occur together.
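
For an intron-free insert, the "bounded by start and stop codons" criterion can be sketched as below. This is a deliberately simplified check under stated assumptions: it uses the standard ATG start and TAA/TAG/TGA stop codons and would reject a genomic fragment containing introns, which the text above explicitly permits. The function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_simple_orf(seq: str) -> bool:
    """Return True if `seq` starts with ATG, has a length divisible by three,
    ends with a stop codon and contains no earlier in-frame stop codon.
    Intron-containing sequences are NOT handled by this simplified check."""
    s = seq.upper()
    if len(s) < 6 or len(s) % 3 != 0 or not s.startswith("ATG"):
        return False
    codons = [s[i:i + 3] for i in range(0, len(s), 3)]
    has_terminal_stop = codons[-1] in STOP_CODONS
    has_internal_stop = any(c in STOP_CODONS for c in codons[:-1])
    return has_terminal_stop and not has_internal_stop


print(is_simple_orf("ATGAAATAA"))      # start, one codon, stop
print(is_simple_orf("ATGTAAAAATAA"))   # premature in-frame stop
```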

A DNA construct according to this embodiment of
the invention, which may be referred to hereinafter as
an 'outron expression construct', may be derived from
an outron cloning construct by insertion of a
heterologous or homologous protein-encoding DNA
fragment into the restriction site/multi-cloning site.
It is essential that the heterologous or homologous
DNA fragment be inserted downstream of the outron
sequence such that the two sequences may be
co-transcribed, with the outron sequence forming part of
the 5' untranslated region of the resulting mRNA.

The outron expression construct may
advantageously form an expression vector, such as, for
example, a plasmid vector. Most preferably, the
expression vector will be one suitable for use in the
nematode worm C. elegans. In addition to the
bacteriophage promoter, outron sequence and
protein-encoding DNA sequence (open reading frame), the
expression vector may further contain one or more of
the general features commonly found in expression
vectors, for example an origin of replication to allow
autonomous replication within a bacterial host cell
and a selective marker, such as an antibiotic
resistance gene. The vector may also contain a
poly-adenylation signal to stabilize and process the
3' end of the mRNA transcribed from the bacteriophage
promoter. A preferred example is the 3' UTR from the
C. elegans unc-54 gene, but any other 3' UTR or
polyadenylation signal may be used. An additional
element, such as for example a synthetic intron, may
be interposed between the outron sequence and the open
reading frame.

It is important that the open reading frame is
positioned downstream of and proximal to the outron
sequence in the expression construct such that (i) the
two elements are co-transcribed to form a single mRNA
and (ii) the outron sequence forms part of the 5'
untranslated region of the mRNA. If the appropriate
splicing machinery and a supply of SL RNAs are provided
by the eukaryotic host cell or organism then the
uncapped 5' end of the pre-mRNA transcribed from the
expression construct will be replaced with a capped
splice leader via the trans-splicing reaction. This
will greatly increase the efficiency of translation in
a eukaryotic host system.
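
The replacement of the uncapped 5' end by a spliced leader can be pictured with a toy string model. This is only a sketch: it treats the first TTTCAG in the transcript as the trans-splice acceptor, discards everything up to and including it, and prepends the 22-nt SL1 sequence (written here in the DNA alphabet; the SL1 sequence itself is not given in this specification and is supplied as an assumption). The cap, the splicing machinery and acceptor-site choice in vivo are not modelled, and all names are hypothetical.

```python
SL1_LEADER = "GGTTTAATTACCCAAGTTTGAG"  # assumed 22-nt SL1 sequence, DNA alphabet


def trans_splice(pre_mrna: str, acceptor: str = "TTTCAG",
                 leader: str = SL1_LEADER) -> str:
    """Toy model: replace the 5' end of `pre_mrna`, up to and including the
    first `acceptor` motif, with `leader`. Returns the input unchanged
    (upper-cased, U converted to T) if no acceptor is found."""
    s = pre_mrna.upper().replace("U", "T")
    a = s.find(acceptor)
    if a == -1:
        return s
    return leader + s[a + len(acceptor):]


# Hypothetical transcript: 5' leader, toy outron, then a short coding region.
transcript = "GGGAGA" + "ATTTTATTAAATTTTATTTTCAG" + "ATGAAATAA"
print(trans_splice(transcript))
```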

The use of an outron sequence at the extreme 5'
end of the RNA provides a solution to the problem of
reduced expression efficiency in eukaryotic systems
wherever the type of promoter/polymerase used to drive
gene expression leads to the production of uncapped
transcripts, provided that the host cell or organism
produces the spliced leader RNAs required for the
trans-splicing reaction.

Outron sequences which may be utilised in
accordance with the invention include naturally
occurring outron sequences isolated from SL1-specific
C. elegans genes (Conrad, R. Functional analysis of a
C. elegans trans-splice acceptor. Nucleic Acids Res.
1993, 21(4), pp913-919; Conrad, R. SL1 trans-splicing
specified by AU-rich synthetic RNA inserted at the 5'
end of Caenorhabditis elegans pre-mRNA. RNA. 1995,
1(2), pp164-170) and also synthetic outron sequences
which are functionally equivalent to the natural C.
elegans outron sequences, including variants of
naturally occurring C. elegans outrons. The phrase
"functionally equivalent" means that the synthetic
outron is recognised by the C. elegans trans-splicing
machinery and can be trans-spliced to a C. elegans
splice leader RNA, preferably the SL1 splice leader.

Experimental evidence indicates that
trans-splicing in C. elegans is signalled by an AU-rich
intron-like sequence followed by a splice acceptor
site (Conrad et al. 1993 and 1995). For the purposes
of the present application the terms "outron" or
"outron sequence" should be interpreted as referring
to both the AU-rich region from the 5' end of the
pre-mRNA to the trans-splice acceptor site and the
trans-splice acceptor site itself. In connection with the
DNA constructs of the invention, the terms "outron"
and "outron sequence" refer to features present in the
DNA which encodes the pre-mRNA.

The consensus splice acceptor site for
trans-splicing of outrons and the consensus 3' splice
acceptor site for cis-splicing of introns are
essentially identical (UUUCAG). Moreover, a normally
trans-spliced acceptor site can be efficiently
cis-spliced when a donor splice site is inserted upstream
within the outron sequence. It is therefore important
that the outron constructs described herein do not
contain any potential splice donor sequence upstream
of the splice acceptor within the outron and
downstream of the transcription start site such that
it will be transcribed in the mRNA encoded by the
construct. If such a site were present then there
would be a potential for cis-splicing rather than
trans-splicing.
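
A candidate transcribed region can be screened for donor-like motifs upstream of the acceptor. The sketch below is an assumption-laden illustration: it uses GT[AG]AGT as a simplified stand-in for a 5' splice donor consensus (the specification does not define a donor pattern), and it only inspects the sequence before the first TTTCAG. The function and constant names are hypothetical.

```python
import re
from typing import List

# Simplified, assumed donor-like pattern; lookahead allows overlapping hits.
DONOR_LIKE = re.compile(r"(?=GT[AG]AGT)")


def donor_like_sites(transcribed: str, acceptor: str = "TTTCAG") -> List[int]:
    """Return 0-based start positions of donor-like motifs located upstream
    of the first `acceptor` motif (or in the whole sequence if none)."""
    s = transcribed.upper().replace("U", "T")
    a = s.find(acceptor)
    upstream = s if a == -1 else s[:a]
    return [m.start() for m in DONOR_LIKE.finditer(upstream)]


clean = "ATTTTAATTTTATTTTCAGATG"       # hypothetical outron without donor motif
risky = "AAGGTAAGTTATTTTATTTTCAGATG"   # hypothetical outron with GTAAGT upstream

print(donor_like_sites(clean))
print(donor_like_sites(risky))
```

An empty list does not prove the absence of cryptic donor sites; it only shows that the simplified pattern was not found.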

It has also been observed that the overall length
of the outron has an effect on the efficiency of
trans-splicing, longer outrons in general working
better than shorter ones (Conrad et al. 1995).
Advantageously, the outron sequences for inclusion
into the outron constructs described herein should be
greater than about 50 nt in length.
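
The features discussed so far (an A/T-rich region, a terminal TTTCAG acceptor and a length above about 50 nt) can be summarised for a candidate sequence as in the following sketch. The function name and the toy sequence are hypothetical, and no numerical threshold for "AT-rich" is stated in the text, so the A/T fraction is simply reported rather than judged.

```python
from typing import Dict, Union


def outron_report(seq: str, acceptor: str = "TTTCAG",
                  min_len: int = 50) -> Dict[str, Union[int, float, bool]]:
    """Summarise length, terminal acceptor and A/T content of a candidate
    outron given in the DNA (or RNA) alphabet."""
    s = seq.upper().replace("U", "T")
    ends_with_acceptor = s.endswith(acceptor)
    # A/T fraction is computed on the region upstream of the acceptor.
    upstream = s[:-len(acceptor)] if ends_with_acceptor else s
    at_fraction = sum(b in "AT" for b in upstream) / len(upstream) if upstream else 0.0
    return {
        "length": len(s),
        "longer_than_min": len(s) > min_len,
        "ends_with_acceptor": ends_with_acceptor,
        "at_fraction": round(at_fraction, 2),
    }


toy_outron = "AT" * 25 + "TTTCAG"   # hypothetical 56-nt candidate
print(outron_report(toy_outron))
```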

A synthetic outron containing an AT stretch and a
TTTTCAG sequence has been shown to be functional in C.
elegans. As illustrated in the accompanying Examples,
the insertion of an outron sequence into the 5'
untranslated region of a GFP reporter construct,
downstream of the promoter and upstream of the GFP
open reading frame, is required for optimal expression
of bacteriophage RNA polymerase transcribed reporter
gene mRNA in C. elegans.

Suitable bacteriophage promoters which may be
used in the DNA constructs according to the invention
include T7, T3 and SP6 promoters, with T7 being the
most preferred. As discussed above, these
bacteriophage promoters have long been known to be
useful tools in molecular biology since they can
provide simple and strong expression systems dependent
only on the binding of the specific or cognate RNA
polymerase.
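
Because each phage polymerase recognises only its cognate promoter, it can be useful to confirm which promoter a construct actually carries. The sketch below searches for the commonly cited minimal consensus sequences of the T7, T3 and SP6 promoters. These strings are supplied as assumptions (they do not appear in this specification), only the given strand is searched, and the names are hypothetical.

```python
from typing import List

# Commonly cited minimal consensus sequences (assumed, not from the source).
PHAGE_PROMOTERS = {
    "T7": "TAATACGACTCACTATAG",
    "T3": "AATTAACCCTCACTAAAG",
    "SP6": "ATTTAGGTGACACTATAG",
}


def promoters_present(seq: str) -> List[str]:
    """Return the names of phage promoters whose consensus occurs in `seq`
    (top strand only)."""
    s = seq.upper()
    return [name for name, motif in PHAGE_PROMOTERS.items() if motif in s]


construct = "GG" + PHAGE_PROMOTERS["T7"] + "GGAGATTTTCAGATG"  # hypothetical
print(promoters_present(construct))
```

A construct reported as carrying the T7 promoter would then be paired with a strain expressing T7 RNA polymerase rather than the T3 or SP6 enzyme.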

In a still further aspect, the invention provides
a method for expressing a recombinant polypeptide in
C. elegans, which method comprises:

introducing an outron expression construct, as
described above, said construct being an expression
vector suitable for use in C. elegans, into a C.
elegans strain which expresses an RNA polymerase
specific for the bacteriophage promoter present in
said DNA construct in one or more tissues or cell
types.

An outron expression vector for use in this
method may be constructed by inserting DNA encoding
the polypeptide of interest into an outron cloning
vector, as described above. The vector must be one
which is suitable for use in C. elegans; plasmid-based
vectors are the most preferred.

The C. elegans worms are preferably transgenic
worms carrying a transgene capable of expressing the
RNA polymerase in one or more tissues or cell types.
The term "transgene capable of expressing" as used
herein means a nucleic acid molecule comprising a
nucleotide sequence encoding the polymerase operably
linked to a promoter. The promoter may be any
promoter which functions in C. elegans and may be
general (i.e. active in substantially all tissues and
cell types), tissue-specific, cell type-specific,
constitutive, inducible etc. Most preferably, the
promoter will exhibit tissue or cell type-specificity.
With the use of a tissue or cell type-specific
promoter of the appropriate specificity it is possible
to control the site of RNA polymerase expression
within C. elegans and hence control the site of
expression of the recombinant polypeptide.

Methods for the construction of transgenic C.
elegans worms are known in the art and are
particularly described by Craig Mello and Andrew Fire,
Methods in Cell Biology, Vol 48, Ed. H.F. Epstein and
D.C. Shakes, Academic Press, pages 452-480.

In a further aspect the invention provides a kit
for use in recombinant expression of a polypeptide in
C. elegans, the kit comprising an outron cloning
construct, as described above, and optionally a supply
of C. elegans nematode worms expressing an RNA
polymerase specific for the bacteriophage promoter
present in the said outron cloning construct in one or
more tissues or cell types.

The kit might further contain control inserts and
control constructs, e.g. reporter gene inserts and
constructs which could be used to check efficiency of
cloning steps and transfection steps, respectively.
It might also contain constructs which may be used as
selectable markers in the transfection procedure, e.g.
a rol-6 plasmid (see below).

The invention further provides methods for the
construction of transgenic C. elegans expressing a
recombinant polypeptide in one or more tissues or cell
types. One such method comprises introducing an
outron expression construct, as described above, said
construct being an expression vector suitable for use
in C. elegans comprising an open reading frame
encoding the desired recombinant polypeptide, into a
C. elegans strain which expresses an RNA polymerase
specific for the bacteriophage promoter present in
said DNA construct in one or more tissues or cell
types, and isolating transgenic C. elegans lines which
stably express the said polypeptide. The C. elegans
strain expressing the polymerase is preferably a
transgenic strain carrying a transgene capable of
expressing the RNA polymerase in one or more tissues
or cell types, as described above. As aforesaid,
transgenic C. elegans lines can readily be constructed
using standard techniques well known in the art.

In an alternative approach, the method may
comprise introducing into a background C. elegans
strain (i) an outron expression construct, as
described above, said construct being an expression
vector suitable for use in C. elegans comprising an
open reading frame encoding the desired recombinant
polypeptide, and (ii) a DNA construct suitable for
expression of an RNA polymerase specific for the
bacteriophage promoter present in the outron
expression construct in one or more tissues or cell
types of C. elegans, and isolating transgenic C.
elegans lines which stably express the said
polypeptide. The second DNA construct may,
advantageously, be an expression vector comprising a
nucleotide sequence encoding the polymerase operably
linked to a promoter having the appropriate tissue or
cell type specificity.

In carrying out the methods of the invention one
may employ standard techniques well known in the art
for construction and selection of transgenic C.
elegans lines. Such techniques are described, for
example, in Methods in Cell Biology, vol 48;
Caenorhabditis elegans: modern biological analysis of
an organism, ed. Epstein and Shakes, Academic Press,
1995. Foreign DNA (e.g. plasmid DNA) may be introduced
into C. elegans using microinjection or ballistic
transformation, as described in the applicant's
co-pending International patent application No. WO
99/49066. In order to facilitate the selection of
transgenic strains a marker plasmid may be
co-introduced with the transgenes. A typical example is
the plasmid pRF4 (Mello, C. C. et al. EMBO J. 10,
3959-3970 (1991)) which carries the rol-6 gene. C.
elegans expressing rol-6 can be identified by
screening for the roller phenotype. Any other C.
elegans dominant selectable phenotypic marker, of
which there are many known in the art, may be used to
facilitate selection of transgenic lines. A useful
example is green fluorescent protein (or any of the
equivalent autonomous fluorescent proteins known in
the art).

In a still further aspect the invention provides
transgenic C. elegans worms which contain an outron
expression construct, as described above, said
construct being an expression vector suitable for use
in C. elegans, and which further express an RNA
polymerase specific for the bacteriophage promoter
present in the outron expression construct in one or
more tissues or cell types.

The present invention will be further understood
with reference to the following non-limiting Examples,
together with the accompanying drawings in which:

Figure 1 illustrates the construction of a
T7-outron-GFP vector. (A) sequence of the synthetic outron
produced by annealing oligonucleotides o-GN59 and
o-GN60. (B) summary of the strategy used to construct
vector pDW3124.

Figure 2 shows plasmid maps for pDW3123 (outron
cloning vector) and pDW3124 (outron expression vector
for GFP expression).

Figure 3 is a plasmid map of pGN148 which contains a
T7 RNA polymerase coding sequence under the regulation
of the C. elegans SERCA promoter.

Figure 4 illustrates the nucleotide sequence of
pGN148.

Figure 5 illustrates the nucleotide sequence of
pDW3123 annotated to show the positions of the T7
promoter, outron, synthetic intron A, multi-cloning
site and unc-54 3' UTR sequences and also the
ampicillin resistance gene.

Figure 6 illustrates the nucleotide sequence of
pDW3124 annotated to show the positions of the T7
promoter, outron, synthetic intron A, GFP with introns
and unc-54 3' UTR sequences and also the ampicillin
resistance gene.

Example 1 - Construction of a T7-outron-GFP containing
vector (pDW3124)

An SL1 trans-splice acceptor site (outron) was
cloned into a vector downstream of the T7 promoter and
upstream of the GFP to be expressed.

A synthetic outron consisting of two partially
overlapping oligonucleotides (o-GN59 and o-GN60, see
Figure 1) was inserted into an XbaI/XmaI digested T7
promoter GFP construct. Briefly, 25 µl o-GN59 and 25 µl
o-GN60 (100 µM) were denatured for 5 minutes at 94°C,
annealed for 30 minutes at 68°C then cooled to 4°C.
1 µl of XmaI/XbaI digested pDW3120 and 10 µl of the
annealed oligos were then ligated using T4 ligase
overnight at 16°C, transformed into competent E. coli
and analysed by restriction digestion and DNA
sequencing, all according to standard molecular
biology procedures. The resulting vector was
designated pDW3124 (Figures 1 and 2).

The outron contains an AU-rich sequence followed
by a splice-acceptor site as described by Conrad et
al., NAR 21:913-919 (1993) (see Figure 1).

Example 2 - Construction of a T7-Outron MCS vector

A general purpose vector was constructed to
facilitate expression of other DNA sequences in C.
elegans under the control of the T7 promoter. This
was done by digesting vector pDW3124 with HindII
(position 179) and PvuII (position 1029) (partial
digest) and re-ligating the blunt ends, resulting in
vector pDW3123 (Figure 2).
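
The effect of the double digest and blunt-end re-ligation on plasmid length can be sketched as a simple deletion between the two cut positions. This is an illustration under stated assumptions: the positions 179 and 1029 are treated as 0-based cut coordinates on a linearised representation, and the plasmid sequence is a placeholder of hypothetical length, since the real pDW3124 sequence (Figure 6) is not reproduced here.

```python
def religate_after_double_cut(plasmid: str, cut_a: int, cut_b: int) -> str:
    """Return the sequence obtained by removing the fragment between two
    blunt cut positions and joining the remaining ends."""
    lo, hi = sorted((cut_a, cut_b))
    if not 0 <= lo <= hi <= len(plasmid):
        raise ValueError("cut positions must lie within the plasmid")
    return plasmid[:lo] + plasmid[hi:]


# Placeholder sequence of hypothetical length standing in for pDW3124.
placeholder = "N" * 4000
deleted = religate_after_double_cut(placeholder, 179, 1029)
print(len(placeholder) - len(deleted))   # size of the removed fragment
```

Under these assumptions the removed fragment is 850 bp long, whatever the total plasmid size.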
Example 3 - The expression of heterologous genes in C.
elegans regulated by the T7 promoter requires
trans-splicing.

Wild-type C. elegans nematodes were co-injected
with various combinations of the following test
plasmids:

1) GFP reporter plasmid
   GFP: pDW2020
   outron-GFP: pDW2024
   T7 promoter-GFP: pDW3120
   T7 promoter-outron-GFP: pDW3124

2) T7 polymerase expression plasmid
   SERCA T7 polymerase: pGN148

together with pRF4 (rol-6) as marker.

For every co-injection experiment, a total
concentration of 200 ng DNA/µl was used (plasmid
concentration was 50 ng/µl and carrier DNA was added
up to 200 ng/µl). For every co-injection approximately
15 adult worms were injected.
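
The amount of carrier DNA needed to reach the stated total can be worked out as below. The sketch assumes that each co-injected plasmid is present at 50 ng/µl, which is one reading of the text; the function name is hypothetical.

```python
def carrier_concentration(n_plasmids: int,
                          per_plasmid_ng_per_ul: float = 50.0,
                          total_ng_per_ul: float = 200.0) -> float:
    """Return the carrier DNA concentration (ng/µl) needed so that plasmids
    plus carrier add up to the target total concentration."""
    plasmid_total = n_plasmids * per_plasmid_ng_per_ul
    if plasmid_total > total_ng_per_ul:
        raise ValueError("plasmids alone exceed the target total concentration")
    return total_ng_per_ul - plasmid_total


# Reporter + polymerase plasmid + marker plasmid (three plasmids).
print(carrier_concentration(3))
# Reporter + marker plasmid only (two plasmids).
print(carrier_concentration(2))
```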

F1 offspring showing the marker rol-6 phenotype
were isolated and then selected for further study.
The next generation (F2) of the roller lines were
screened for GFP expression in the pharynx, vulva,
tail and body wall muscles. These are the tissues in
which the bacteriophage T7 RNA polymerase is known to
be expressed when under the control of the C. elegans
SERCA promoter (as in the construct pGN148).

The results are shown in Table 1 below, which 
indicates the number of lines expressing GFP vs total 
number of lines isolated. 
Table 1

     1                                           2                  3

A    Construct                                   no T7-polymerase   with T7-polymerase
                                                 construct          construct (50 ng) pGN148
B    GFP (50 ng) pDW2020                         0/8                2/6*
C    outron::GFP (50 ng) pDW2024                 0/11               3/8*
D    T7-promoter::GFP (50 ng) pDW3120            0/3                0/5
E    T7-promoter::outron::GFP (50 ng) pDW3124    0/7                13/13

* GFP expression most probably the result of recombination
in the extrachromosomal array
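
The counts in Table 1 can be converted to fractions of GFP-expressing lines for easier comparison. The sketch below simply re-enters the table values; the dictionary layout and names are illustrative choices, and no statistical test is implied.

```python
# (expressing, total) without and with the T7 polymerase construct, per Table 1.
TABLE_1 = {
    "GFP (pDW2020)": ((0, 8), (2, 6)),
    "outron::GFP (pDW2024)": ((0, 11), (3, 8)),
    "T7-promoter::GFP (pDW3120)": ((0, 3), (0, 5)),
    "T7-promoter::outron::GFP (pDW3124)": ((0, 7), (13, 13)),
}


def fraction(expressing: int, total: int) -> float:
    """Fraction of isolated lines that express GFP."""
    return expressing / total


for construct, (without_pol, with_pol) in TABLE_1.items():
    print(f"{construct}: {fraction(*without_pol):.2f} without, "
          f"{fraction(*with_pol):.2f} with T7 polymerase")
```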

No GFP expression was observed in the experiments
where the T7 RNA polymerase was absent (cells B2, C2,
D2, E2).

In the experiments where the T7 RNA polymerase
expressing vector was co-injected with GFP vectors
without a T7 promoter, as in the cells B3 and C3, GFP
expression was sometimes observed. This is probably
due to recombination events in the extrachromosomal
arrays, resulting in transcription of GFP directly
from the SERCA promoter.

In the experiments where the T7 promoter-GFP construct
and the SERCA T7 RNA polymerase were co-injected, no
GFP expression could be observed (cell D3). In
contrast, all of the lines isolated from the
experiments where the GFP transcript contained an
outron at its 5' end (n=13) expressed GFP (cell E3).
The outron is a favourable target for SL1
trans-splicing. Since SL1 RNA molecules contain a 5'
trimethylguanosine CAP structure which is transferred
to the mature mRNA this results in improved
translation of the RNA and hence better expression of
GFP. Without the outron the T7 RNA polymerase
transcripts do not carry a CAP structure at their 5'
end, leading to inefficient translation. The results
of this experiment illustrate the importance of
trans-splicing for efficient expression of
heterologous and homologous genes transcribed by
prokaryotic polymerases in C. elegans.



SEQUENCE LISTING

SEQ ID NO: 1 Oligonucleotide o-GN59

SEQ ID NO: 2 Oligonucleotide o-GN60

SEQ ID NO: 3 Plasmid pDW3123

SEQ ID NO: 4 Plasmid pDW3124

SEQ ID NO: 5 Plasmid pGN148
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Claims : 

1- A DNA construct comprising a bacteriophage 
promoter operably linked to an outron sequence. 

5 

2. A DNA construct as claimed in claim 1 which 
further comprises at least one restriction enzyme 
recognition site positioned downstream of and proximal 
to the outron sequence. 

10 

3. A DNA construct as claimed in claim 2 which 
comprises a multi-cloning site positioned downstream 
of and proximal to the outron sequence. 

15 4. A DNA construct as claimed in claim 2 or 

claim 3 which further comprises a DNA fragment 
inserted at the said restriction site or at a 
restriction site within the said multi-cloning site. 

5. A DNA construct as claimed in any one of
claims 1 to 4 which is a replicable cloning vector.

6. A DNA construct as claimed in any one of
claims 1 to 5 wherein the outron sequence comprises a
3' splice acceptor site having the sequence TTTCAG
preceded by an AT-rich region.

7. A DNA construct as claimed in claim 6
wherein the outron sequence comprises the nucleotide
sequence illustrated in Figure 1A.

8. A DNA construct as claimed in any one of 
claims 1 to 7 wherein the bacteriophage promoter is 
the T7, T3 or SP6 promoter. 

9. A DNA construct for use in bacteriophage
promoter-driven expression of a polypeptide in a
eukaryotic host cell or organism, which construct
comprises a bacteriophage promoter operably linked to
a DNA sequence such that it is capable of initiating
transcription of said DNA sequence upon binding of the
appropriate RNA polymerase to the promoter, wherein
the said DNA sequence comprises an outron sequence and
at least one open reading frame positioned downstream
of the outron sequence.

10. A DNA construct as claimed in claim 9 which
is an expression vector.

11. A DNA construct as claimed in claim 9 or
claim 10 wherein the outron sequence comprises a 3'
splice acceptor site having the sequence TTTCAG
preceded by an AT-rich region.

12. A DNA construct as claimed in claim 11
wherein the outron sequence comprises the nucleotide
sequence illustrated in Figure 1A.

13. A DNA construct as claimed in any one of 
claims 9 to 12 wherein the bacteriophage promoter is 
the T7, T3 or SP6 promoter. 

14. A kit for use in recombinant expression of a
polypeptide in C. elegans, the kit comprising a DNA
construct as claimed in any one of claims 1 to 3, and
optionally C. elegans worms expressing an RNA
polymerase specific for the bacteriophage promoter
present in said DNA construct in one or more tissues
or cell types.

15. A method for expressing a recombinant
polypeptide in C. elegans which method comprises:

introducing a DNA construct as claimed in any one
of claims 9 to 13, said construct being an expression
vector suitable for use in C. elegans, into a C.
elegans strain which expresses an RNA polymerase
specific for the bacteriophage promoter present in
said DNA construct in one or more tissues or cell
types.

16. A method of generating transgenic C. elegans
expressing a recombinant polypeptide, which method
comprises:

introducing a DNA construct as claimed in any one
of claims 9 to 13 comprising an open reading frame
encoding the recombinant polypeptide, said construct
being an expression vector suitable for use in C.
elegans, into a C. elegans strain which expresses an
RNA polymerase specific for the bacteriophage promoter
present in said DNA construct in one or more tissues
or cell types, and

isolating transgenic C. elegans lines which
stably express the said polypeptide.

17. A method of generating transgenic C. elegans
expressing a recombinant polypeptide, which method
comprises:

introducing into C. elegans (i) a first DNA
construct as claimed in any one of claims 9 to 13
comprising an open reading frame encoding the
recombinant polypeptide, said construct being an
expression vector suitable for use in C. elegans, and
(ii) a second DNA construct suitable for expression of
an RNA polymerase specific for the bacteriophage
promoter present in the first DNA construct in one or
more tissues or cell types of C. elegans, and

isolating transgenic C. elegans lines which
stably express the said polypeptide.

18. Transgenic C. elegans which contain a DNA
construct as claimed in any one of claims 9 to 13,
said construct being an expression vector suitable for
use in C. elegans, and which further express an RNA
polymerase specific for the bacteriophage promoter
present in said DNA construct in one or more tissues
or cell types.

XbaI overhang    SspI    3' splice acceptor

CTAGATTACAACTAATTATACTTATTTGAATATTCAAATTTTCAGAC  o-GN59

TAATGTTGATTAATATGAATAAACTTATAAGTTTAAAAGTCTGGGCC  o-GN60

XmaI overhang
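
The pairing of the two oligonucleotides drawn above can be checked mechanically. The following Python sketch is an illustration only and is not part of the original disclosure. It takes the two sequences as printed, confirms that o-GN60 (drawn 3' to 5') is complementary to o-GN59 outside the single-stranded ends, and reports the two overhangs together with the AT content upstream of the TTTCAG acceptor. The function names and the 4-nucleotide overhang length are assumptions chosen for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Oligonucleotide sequences as printed in the figure.
O_GN59 = "CTAGATTACAACTAATTATACTTATTTGAATATTCAAATTTTCAGAC"       # top strand, 5'->3'
O_GN60_3TO5 = "TAATGTTGATTAATATGAATAAACTTATAAGTTTAAAAGTCTGGGCC"  # bottom strand, drawn 3'->5'


def annealed_overhangs(top, bottom_3to5, overhang=4):
    """Return (top 5' overhang, bottom 5' overhang), both written 5'->3'.

    Raises ValueError if the strands do not pair outside the overhangs.
    The overhang length of 4 nt is an assumption for this example.
    """
    paired_top = top[overhang:]
    paired_bottom = bottom_3to5[:len(bottom_3to5) - overhang]
    if paired_top.translate(COMPLEMENT) != paired_bottom:
        raise ValueError("strands are not complementary in the duplex region")
    # The bottom overhang is printed 3'->5', so reverse it to read 5'->3'.
    return top[:overhang], bottom_3to5[-overhang:][::-1]


def at_fraction_before(seq, motif="TTTCAG"):
    """AT fraction of everything upstream of the first occurrence of motif."""
    idx = seq.find(motif)
    if idx <= 0:
        raise ValueError("motif absent or at the very start")
    upstream = seq[:idx]
    return sum(base in "AT" for base in upstream) / len(upstream)


top_oh, bottom_oh = annealed_overhangs(O_GN59, O_GN60_3TO5)
print("overhangs:", top_oh, bottom_oh)
print("AT fraction upstream of TTTCAG:", round(at_fraction_before(O_GN59), 2))
```

The overhangs returned, CTAG and CCGG, are the 5' extensions left by XbaI and XmaI digestion respectively, which is consistent with ligation of the annealed oligonucleotides into an XmaI/XbaI-digested vector.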




Plasmid map labels: GFP with introns; amp; unc-54 3' UTR




XmaI/XbaI-digested pDW3120
25 µl o-GN59 + 25 µl o-GN60 (100 µM)
Denature oligos o-GN59 and o-GN60 for 5 min at 94°C
Renature for 30 min at 68°C, cool to 4°C

Ligate 1 µl vector + 10 µl oligos with T4 ligase
Overnight at 16°C
Transform into E. coli

Analyse by restriction digest and sequencing



Plasmid map labels: GFP with introns; SacI; unc-54 3' UTR; amp

Construct map labels: T7 promoter; Outron; synth. intron A; GFP with introns

Nucleotide sequence of pGN148

atgactgctccaaagaagaagcgtaaggtaccggtaatgaacacgattaacatcgctaagaacgacttctc 
tgacatcgaactggctgctatcccgttcaacactctggctgaccattacggtgagcgtttagctcggtaag 
tttaaacatctagatactaactaacgattaacatttaaattttcagcgaacagttggcccttgagcatgag 
tcttacgagatgggtgaagcacgcttccgcaagatgtttgagcgtcaacttaaagctggtgaggttgcgga 
taacgctgccgccaagcctctcatcactaccctactccctaagatgattgcacgcatcaacgactggtttg 
aggaagtgaaagctaagcgcggcaagcgcccgacagccttccagttcctgcaagaaatcaagccggaagcc
gtagcgtacatcaccattaagaccactctggcttgcctaaccagtgctgacaatacaaccgttcaggctgt 
agcaagcgcaatcggtcgggccattgaggacgaggctcgcttcggtcgtatccgtgaccttgaagctaagc 
acttcaagaaaaacgttgaggaacaactcaacaagcgcgtagggcacgtctacaagaaagcatttatgcaa 
gttgtcgaggctgacatgctctctaagggtctactcggtggcgaggcgtggtcttcgtggcataaggaaga 
ctctattcatgtaggagtacgctgcatcgagatgctcattgagtcaaccggagtggttagcttacaccgcc 
aaaatgctggcgtagtaggtcaagactctgagactatcgaactcgcacctgaatacgctgaggctatcgca 
acccgtgcaggtgcgctggctggcatctctccgatgttccaaccttgcgtagttcctcctaagccgtggac 
tggcattactggtggtggctattgggctaacggtcgtcgtcctctggcgctggtgcgtactcacagtaaga 
aagcactgatgcgctacgaagacgtttacatgcctgaggtgtacaaagcgattaacattgcgcaaaacacc 
gcatggaaaatcaacaagaaagtcctagcggtcgccaacgtaatcaccaagtggaagcattgtccggtcga 
ggacatccctgcgattgagcgtgaagaactcccgatgaaaccggaagacatcgacatgaatcctgaggctc 
tcaccgcgtggaaacgtgctgccgctgctgtgtaccgcaaggacaaggctcgcaagtctcgccgtatcagc 
cttgagttcatgcttgagcaagccaataagtttgctaaccataaggccatctggttcccttacaacatgga 
ctggcgcggtcgtgtttacgctgtgtcaatgttcaacccgcaagctaacgatatgaccaaaggactgctta 
cgctggcgaaaggtaaaccaatcggtaaggaaggttactactggctgaaaatccacggtgcaaactgtgcg 
ggtgtcgataaggttccgttccctgagcgcatcaagttcattgaggaaaaccacgagaacatcatggcttg 
cgctaagtctccactggagaacacttggtgggctgagcaagattctccgttctgcttccttgcgttctgct 
ttgagtacgctggggtacagcaccacggcctgagctataactgctcccttccgctggcgtttgacgggtct 
tgctctggcatccagcacttctccgcgatgctccgagatgaggtaggtggtcgcgcggttgtaagtttaaa 
ctctafccctactaactaacgaagcttatttaaattttcagaacttgcttcctagtgaaaccgttcaggaca 
tctacgggattgttgctaagaaagtcaacgagattctacaagcagacgcaatcaatgggaccgataacgaa 
gtagttaccgtgaccgatgagaacactggtgaaatctctgagaaagtcaagctgggcactaaggcactggc 
tggtcaatggctggcttacggtgttactcgcagtgtgactaagcgttcagtcatgacgctggcttacgggt 
ccaaagagttcggcttccgtcaacaagtgctggaagataccattcagccagctattgattccggcaagggt 
ctgatgttcactcagccgaatcaggctgctggatacatggctaagctgatttgggaatctgtgagcgtgac 
ggtggtagctgcggttgaagcaatgaactggcttaagtctgctgctaagctgctggctgctgaggtcaaag 
ataagaagactggagagattcttcgcaagcgttgcgctgtgcattgggtcactccggatggtttccctgtg 
tggcaggaatacaagaagcctattcaaacgcgtttgaacctgatgttcctcggtcagttccgcttacagcc 
taccattaacaccaacaaagatagcgagattgatgcacacaaacaggagtctggtatcgctcctaactttg 
tacacagccaagacggtagccaccttcgtaagactgtagtgtgggcacacgagaagtacggaatcgaatct 
tttgcactgattcacgactccttcggtaccattccggctgacgctgcgaacctgttcaaagcagtgcgcga 
aactatggttgacacatatgagtcttgtgatgtactggctgatttctacgaccagttcgctgaccagttgc 
acgagtctcaattggacaaaatgccagcacttccggctaaaggtaacttgaacctccgtgacatcttagag 
tcggacttcgcgttcgcgtaagaattccaactgagcgccggtcgctaccattaccaacttgtctggtgtca 
aaaataataggggccgctgtcatcagagtaagtttaaactgagttctactaactaacgagtaatatttaaa
ttttcagcatctcgcgcccgtgcctctgacttctaagtccaattactcttcaacatccctacatgctcttt 
ctccctgtgctcccaccccctatttttgttattatcaaaaaaacttcttcttaatttctttgttttttagc 
ttcttttaagtcacctctaacaatgaaattgtgtagattcaaaaatagaattaattcgtaataaaaagtcg 
aaaaaaattgtgctccctccccccattaataataattctatcccaaaatctacacaatgttctgtgtacac 
ttcttatgttttttttacttctgataaattttttttgaaacatcatagaaaaaaccgcacacaaaatacct 
tatcatatgttacgtttcagtttatgaccgcaatttttatttcttcgcacgtctgggcctctcatgacgtc 
aaatcatgctcatcgtgaaaaagttttggagtatttttggaatttttcaatcaagtgaaagtttatgaaat 
taattttcctgcttttgctttttgggggtttcccctattgtttgtcaagagtttcgaggacggcgcttttc 
ttgctaaaatcacaagtattgatgagcacgatgcaagaaagatcggaagaaggtttgggtttgaggctcag 
tggaaggtgagtagaagttgataatttgaaagtggagtagtgtctatggggtttttgccttaaatgacaga 
atacattcccaatataccaaacataactgtttcctactagtcggccgtacgggccctttcgtctcgcgcgt 
ttcggtgatgacggtgaaaacctctgacacatgcagctcccggagacggtcacagcttgtctgtaagcgga 
tgccgggagcagacaagcccgtcagggcgcgtcagcgggtgttggcgggtgtcggggctggcttaactatg 
cggcatcagagcagattgtactgagagtgcaccatatgcggtgtgaaataccgcacagatgcgtaaggaga 



WO 01/88114    5/13    PCT/EP01/05794



aaataccgcatcaggcggccttaagggcctcgtgatacgcctatttttataggttaatgtcatgataataa 
tggtttcttagacgtcaggtggcacttttcggggaaatgtgcgcggaacccctatttgtttatttttctaa 
atacattcaaatatgtatccgctcatgagacaataaccctgataaatgcttcaataatattgaaaaaggaa 
gagtatgagtattcaacatttccgtgtcgcccttattcccttttttgcggcattttgccttcctgtttttg 
ctcacccagaaacgctggtgaaagtaaaagatgctgaagatcagttgggtgcacgagcgggttacatcgaa
ctggatctcaacagcggtaagatccttgagagttttcgccccgaagaacgttttccaatgatgagcacttt 
taaagttctgctatgtggcgcggtattatcccgtattgacgccgggcaagagcaactcggtcgccgcatac 
actattctcagaatgacttggttgagtactcaccagtcacagaaaagcatcttacggatggcatgacagta 
agagaattatgcagtgctgccataaccatgagtgataacactgcggccaacttacttctgacaacgatcgg
aggaccgaaggagctaaccgcttttttgcacaacatgggggatcatgtaactcgccttgatcgttgggaac 
cggagctgaatgaagccataccaaacgacgagcgtgacaccacgatgcctgtagcaatggcaacaacgttg 
cgcaaactattaactggcgaactacttactctagcttcccggcaacaattaatagactggatggaggcgga 
taaagttgcaggaccacttctgcgctcggcccttccggctggctggtttattgctgataaatctggagccg 
gtgagcgtgggtctcgcggtatcattgcagcactggggccagatggtaagccctcccgtatcgtagttatc 
tacacgacggggagtcaggcaactatggatgaacgaaatagacagatcgctgagataggtgcctcactgat 
taagcattggtaactgtcagaccaagtttactcatatatactttagattgatttaaaacttcatttttaat 
ttaaaaggatctaggtgaagatcctttttgataatctcatgaccaaaatcccttaacgtgagttttcgttc 
cactgagcgtcagaccccgtagaaaagatcaaaggatcttcttgagatcctttttttctgcgcgtaatctg 
ctgcttgcaaacaaaaaaaccaccgctaccagcggtggtttgtttgccggatcaagagctaccaactcttt 
ttccgaaggtaactggcttcagcagagcgcagataccaaatactgtccttctagtgtagccgtagttaggc 
caccacttcaagaactctgtagcaccgcctacatacctcgctctgctaatcctgttaccagtggctgctgc 
cagtggcgataagtcgtgtcttaccgggttggactcaagacgatagttaccggataaggcgcagcggtcgg 
gctgaacggggggttcgtgcacacagcccagcttggagcgaacgacctacaccgaactgagatacctacag 
cgtgagcattgagaaagcgccacgcttcccgaagggagaaaggcggacaggtatccggtaagcggcagggt 
cggaacaggagagcgcacgagggagcttccagggggaaacgcctggtatctttatagtcctgtcgggtttc 
gccacctctgacttgagcgtcgatttttgtgatgctcgtcaggggggcggagcctatggaaaaacgccagc 
aacgcggcctttttacggttcctggccttttgctggccttttgctcacatgttctttcctgcgttatcccc 
tgattctgtggataaccgtattaccgcctttgagtgagctgataccgctcgccgcagccgaacgaccgagc 
gcagcgagtcagtgagcgaggaagcggaagagcgcccaatacgcaaaccgcctctccccgcgcgttggccg 
attcattaatgcagctggcacgacaggtttcccgactggaaagcgggcagtgagcgcaacgcaattaatgt 
gagttagctcactcattaggcaccccaggctttacactttatgcttccggctcgtatgttgtgtggaattg
tgagcggataacaatttcacacaggaaacagctatgaccatgattacgccaagctgtaagtttaaacatga 
tcttactaactaactattctcatttaaattttcagagcttaaaaatggctgaaatcactcacaacgatgga 
tacgctaacaacttggaaatgaaataagcttgcatgcctgcagagcaaaaaaatactgcttttccttgcaa 
aattcggtgctttcttcaaagagaaacttttgaagtcggcgcgagcatttccttcttcgacttctctcttt 
ccgccaaaaagcctagcatttttattgataatttgattacacacactcagagttcttcgacatgataaagt 
gtttcattggcactcgccctaacagtacatgacaagggcggattattatcgatcgatattgaagacaaact 
ccaaatgtgtgctcattttggagccccgtgtggggcagctgctctcaatatattactagggagacgaggag 
ggggaccttatcgaacgtcgcatgagccattctttcttctttatgcactctcttcactctctcacacatta 
atcgattcatagactcccacafctccttgatgaaggtgtgggtttttagctttttttcccgatttgtaaaag 
gaagaggctgacgatgttaggaaaaagagaacggagccgaaaaaacatccgtagtaagtcttccttttaag 
ccgacactttttagacagcattcgccgctagttttgaagtttaaattttaaaaaataaaaattagtttcaa 
ttttttttaattactaaataggcaaaagttttttcaagaactctagaaaaactagctcaattcatgggtac 
tagaaaaattcttgttttaaatttaatatttatcttaagatgtaattacgagaagcttttttgaaaattct 
caattaaaagaatttgccgatttagaataaaagtcttcagaaatgagtaaaagctcaaattagaagtttgt 
ttttaaaggaaaaacacgaaaaaagaacactatttatcttttcctccccgcgtaaaattagttgttgtgat 
aatagtgatccgctgtctacttgcactcggctcttcacaccgtgcttcctctcacttgacccaacaggaaa 
aaaaaacatcacgtctgagacggtgaattgccttatcaagagcgtcgtctctttcacccagtaacaaaaaa 
aatttggtttctttactttatatttatgtaggtcacaaaaaaaaagtgatgcagttttgtgggtcggttgt 
ctccacaccacctccgcctccagcagcacacaatcatcttcgtgtgttctcgacgattccttgtatgccgc 
ggtcgtgaatgcaccacattcgacgcgcaactacacaccacactcactttcggtgguattactacacgtca 
tcgttgttcgtagtctcccgctctttcgtccccactcactcctcattattccccttggcgtattgattttt 
tttaaatggtacaccactcctgacgtttctaccttcttgttttccgtccatttagattctatctggaaatt 
tttttaaaattttaggccacagagttctagttcttgttctaaaagfcctaggtcagacatacattttctatt 
tctcatcaaaaaaaaagttgataaagaaaactggttattcagaaagagtgrgtctcgctgaaattgattca 
aaaaaaaattcccacccctcgcttgtctctcaaaatatgagatcaacggartttttccttctcgattcaat 
tttttgctgcgctctgtctgccaaagtgtgtgtgtccgagcaaaagatgagagaattcacaaacagaaatg
aaaaaaagttggccaaataatgaagttttatccgagattgatgggaaagatattaatgttctttacggttt
ggaggggagagagagatagattttcgcatcaaactccgccttttacatgtcttttagaatctaaaatagat 
ttttctcatcatttttaatagaaaatcgagaaattacagtaatttcgcaattttcttgccaaaaatacacg 
aaatttgtgggtctcgccacgatctcggtcttagtggttcatttggtttaaaagtttataaaatttcaaat 
tctagtgtttaatttccgcataattggacctaaaatgggtttttgtcatcattttcaacaagaaatcgtga 
aaatcctgttgtttcgcaattttcttttcaaaaatacacgaaatatatggtaatttcccgaaatattgagg 
gtctcgccacgatttcagtcacagtggccaggatttatcacgaaaaaagttcgcctagtctcacatttccg 
gaaaaccgaatctaaattagttttttgtcatcattttgaacaaaaaatcgagacatccctatagtttcgca 
attttcgtcgcttttctctccaaaaatgacagtctagaattaaaattcgctggaactgggaccatgatatc 
ttttctccccgtttttcattttattttttattacactggattgactaaaggtcaccaccaccgccagtgtg
tgccatatcacacacacacacacacacaatgtcgagattttatgtgttatccctgcttgatttcgttccgt 
tgtctctctctctctattcatcttttgagccgagaagctccagagaatggagcacacaggatcccggcgcg 
cgatgtcgtcgggagatggcgccgcctgggaagccgccgagagatatcagggaagatcgtctgatttctcc 
tcggatgccacctcatctctcgagtttctccgcctgttactccctgccgaacctgatatttcccgtfcgtcg 
taaagagatgtttttattttactttacaccgggtcctctctctctgccagcacagctcagtgttgcctgtg 
tgctcgggctcctgccaccggcggcctcatcttcttcttcttcttctctcctgctctcgcttatcacttct 
tcattcattcttattccttttcatcatcaaactagcatttcttactttatttatttttttcaattttcaat 
tttcagataaaaccaaactacttgggttacagccgtcaacagatccccgggattggccaaaggacccaaag 
gtatgtttcgaatgatactaacataacatagaacattttcaggaggacccttgcttggagggtaccgagct 
cagaaaaa

T7 promoter Outron

1 AGCTTGGCGC CTAATACGAC TCACTATAGG GCTGCAGGTC GACTCTAGAT TACAACTAAT TATACTTATT
TCGAACCGCG GATTATGCTG AGTGATATCC CGACGTCCAG CTGAGATCTA ATGTTGATTA ATATGAATAA

Outron synth. intron A 



71 TGAATATTCA AATTTTCAGA CCCGGGATTG GCCAAAGGAC CCAAAGGTAT GTTTCGAATG ATACTAACAT 
ACTTATAAGT TTAAAAGTCT GGGCCCTAAC CGGTTTCCTG GGTTTCCATA CAAAGCTTAC TATGATTGTA 

synth. intron A MCS 

141 AACATAGAAC ATTTTCAGGA GGACCCTTGG CTAGCGTCCT GCTGGGATTA CACATGGCAT GGATGAACTA 
TTGTATCTTG TAAAAGTCCT CCTGGGAACC GATCGCAGGA CGACCCTAAT GTGTACCGTA CCTACTTGAT 

unc-54 3' UTR 



211 TACAAATAGG GCCGGCCGAG CTCCGCATCG GCCGCTGTCA TCAGATCGCC ATCTCGCGCC CGTGCCTCTG
ATGTTTATCC CGGCCGGCTC GAGGCGTAGC CGGCGACAGT AGTCTAGCGG TAGAGCGCGG GCACGGAGAC 

unc-54 3' UTR 



281 ACTTCTAAGT CCAATTACTC TTCAACATCC CTACATGCTC TTTCTCCCTG TGCTCCCACC CCCTATTTTT 
TGAAGATTCA GGTTAATGAG AAGTTGTAGG GATGTACGAG AAAGAGGGAC ACGAGGGTGG GGGATAAAAA



unc-54 3' UTR



351 GTTATTATCA AAAAAACTTC TTCTTAATTT CTTTGTTTTT TAGCTTCTTT TAAGTCACCT CTAACAATGA 
CAATAATAGT TTTTTTGAAG AAGAATTAAA GAAACAAAAA ATCGAAGAAA ATTCAGTGGA GATTGTTACT 



unc-54 3' UTR 



421 AATTGTGTAG ATTCAAAAAT AGAATTAATT CGTAATAAAA AGTCGAAAAA AATTGTGCTC CCTCCCCCCA 
TTAACACATC TAAGTTTTTA TCTTAATTAA GCATTATTTT TCAGCTTTTT TTAACACGAG GGAGGGGGGT 

unc-54 3' UTR



491 TTAATAATAA TTCTATCCCA AAATCTACAC AATGTTCTGT GTACACTTCT TATGTTTTTT TTACTTCTGA
AATTATTATT AAGATAGGGT TTTAGATGTG TTACAAGACA CATGTGAAGA ATACAAAAAA AATGAAGACT 



unc-54 3' UTR



561 TAAATTTTTT TTGAAACATC ATAGAAAAAA CCGCACACAA AATACCTTAT CATATGTTAC GTTTCAGTTT 
ATTTAAAAAA AACTTTGTAG TATCTTTTTT GGCGTGTGTT TTATGGAATA GTATACAATG CAAAGTCAAA



unc-54 3' UTR 



631 ATGACCGCAA TTTTTATTTC TTCGCACGTC TGGGCCTCTC ATGACGTCAA ATCATGCTCA TCGTGAAAAA 
TACTGGCGTT AAAAATAAAG AAGCGTGCAG ACCCGGAGAG TACTGCAGTT TAGTACGAGT AGCACTTTTT 



unc-54 3' UTR 



701 GTTTTGGAGT ATTTTTGGAA TTTTTCAATC AAGTGAAAGT TTATGAAATT AATTTTCCTG CTTTTGCTTT 
CAAAACCTCA TAAAAACCTT AAAAAGTTAG TTCACTTTCA AATACTTTAA TTAAAAGGAC GAAAACGAAA 



unc-54 3' UTR



771 TTGGGGGTTT CCCCTATTGT TTGTCAAGAG TTTCGAGGAC GGCGTTTTTC TTGCTAAAAT CACAAGTATT 
AACCCCCAAA GGGGATAACA AACAGTTCTC AAAGCTCCTG CCGCAAAAAG AACGATTTTA GTGTTCATAA 



unc-54 3' UTR 



841 GATGAGCACG ATGCAAGAAA GATCGGAAGA AGGTTTGGGT TTGAGGCTCA GTGGAAGGTG AGTAGAAGTT 
CTACTCGTGC TACGTTCTTT CTAGCCTTCT TCCAAACCCA AACTCCGAGT CACCTTCCAC TCATCTTCAA 



unc-54 3' UTR 



911 GATAATTTGA AAGTGGAGTA GTGTCTATGG GGTTTTTGCC TTAAATGACA GAATACATTC CCAATATACC 
CTATTAAACT TTCACCTCAT CACAGATACC CCAAAAACGG AATTTACTGT CTTATGTAAG GGTTATATGG



unc-54 3' UTR



981 AAACATAACT GTTTCCTACT AGTCGGCCGT ACGGGCCCTT TCGTCTCGCG CGTTTCGGTG ATGACGGTGA 
TTTGTATTGA CAAAGGATGA TCAGCCGGCA TGCCCGGGAA AGCAGAGCGC GCAAAGCCAC TACTGCCACT

1051 AAACCTCTGA CACATGCAGC TCCCGGAGAC GGTCACAGCT TGTCTGTAAG CGGATGCCGG GAGCAGACAA
TTTGGAGACT GTGTACGTCG AGGGCCTCTG CCAGTGTCGA ACAGACATTC GCCTACGGCC CTCGTCTGTT 

1121 GCCCGTCAGG GCGCGTCAGC GGGTGTTGGC GGGTGTCGGG GCTGGCTTAA CTATGCGGCA TCAGAGCAGA 
CGGGCAGTCC CGCGCAGTCG CCCACAACCG CCCACAGCCC CGACCGAATT GATACGCCGT AGTCTCGTCT 

1191 TTGTACTGAG AGTGCACCAT ATGCGGTGTG AAATACCGCA CAGATGCGTA AGGAGAAAAT ACCGCATCAG 
AACATGACTC TCACGTGGTA TACGCCACAC TTTATGGCGT GTCTACGCAT TCCTCTTTTA TGGCGTAGTC 

1261 GCGGCCTTAA GGGCCTCGTG ATACGCCTAT TTTTATAGGT TAATGTCATG ATAATAATGG TTTCTTAGAC 
CGCCGGAATT CCCGGAGCAC TATGCGGATA AAAATATCCA ATTACAGTAC TATTATTACC AAAGAATCTG 

1331 GTCAGGTGGC ACTTTTCGGG GAAATGTGCG CGGAACCCCT ATTTGTTTAT TTTTCTAAAT ACATTCAAAT 
CAGTCCACCG TGAAAAGCCC CTTTACACGC GCCTTGGGGA TAAACAAATA AAAAGATTTA TGTAAGTTTA 

amp 

1401 ATGTATCCGC TCATGAGACA ATAACCCTGA TAAATGCTTC AATAATATTG AAAAAGGAAG AGTATGAGTA 
TACATAGGCG AGTACTCTGT TATTGGGACT ATTTACGAAG TTATTATAAC TTTTTCCTTC TCATACTCAT 

amp 



1471 TTCAACATTT CCGTGTCGCC CTTATTCCCT TTTTTGCGGC ATTTTGCCTT CCTGTTTTTG CTCACCCAGA 
AAGTTGTAAA GGCACAGCGG GAATAAGGGA AAAAACGCCG TAAAACGGAA GGACAAAAAC GAGTGGGTCT 

amp 

1541 AACGCTGGTG AAAGTAAAAG ATGCTGAAGA TCAGTTGGGT GCACGAGTGG GTTACATCGA ACTGGATCTC 
TTGCGACCAC TTTCATTTTC TACGACTTCT AGTCAACCCA CGTGCTCACC CAATGTAGCT TGACCTAGAG 

amp 



1611 AACAGCGGTA AGATCCTTGA GAGTTTTCGC CCCGAAGAAC GTTTTCCAAT GATGAGCACT TTTAAAGTTC 
TTGTCGCCAT TCTAGGAACT CTCAAAAGCG GGGCTTCTTG CAAAAGGTTA CTACTCGTGA AAATTTCAAG 

amp 



1681 TGCTATGTGG CGCGGTATTA TCCCGTATTG ACGCCGGGCA AGAGCAACTC GGTCGCCGCA TACACTATTC 
ACGATACACC GCGCCATAAT AGGGCATAAC TGCGGCCCGT TCTCGTTGAG CCAGCGGCGT ATGTGATAAG 

amp 

1751 TCAGAATGAC TTGGTTGAGT ACTCACCAGT CACAGAAAAG CATCTTACGG ATGGCATGAC AGTAAGAGAA 
AGTCTTACTG AACCAACTCA TGAGTGGTCA GTGTCTTTTC GTAGAATGCC TACCGTACTG TCATTCTCTT 

amp 



1821 TTATGCAGTG CTGCCATAAC CATGAGTGAT AACACTGCGG CCAACTTACT TCTGACAACG ATCGGAGGAC 
AATACGTCAC GACGGTATTG GTACTCACTA TTGTGACGCC GGTTGAATGA AGACTGTTGC TAGCCTCCTG 

amp 



1891 CGAAGGAGCT AACCGCTTTT TTGCACAACA TGGGGGATCA TGTAACTCGC CTTGATCGTT GGGAACCGGA 
GCTTCCTCGA TTGGCGAAAA AACGTGTTGT ACCCCCTAGT ACATTGAGCG GAACTAGCAA CCCTTGGCCT 

amp 

1961 GCTGAATGAA GCCATACCAA ACGACGAGCG TGACACCACG ATGCCTGTAG CAATGGCAAC AACGTTGCGC 
CGACTTACTT CGGTATGGTT TGCTGCTCGC ACTGTGGTGC TACGGACATC GTTACCGTTG TTGCAACGCG

amp

2031 AAACTATTAA CTGGCGAACT ACTTACTCTA GCTTCCCGGC AACAATTAAT AGACTGGATG GAGGCGGATA 
TTTGATAATT GACCGCTTGA TGAATGAGAT CGAAGGGCCG TTGTTAATTA TCTGACCTAC CTCCGCCTAT 

amp 



2101 AAGTTGCAGG ACCACTTCTG CGCTCGGCCC TTCCGGCTGG CTGGTTTATT GCTGATAAAT CTGGAGCCGG 
TTCAACGTCC TGGTGAAGAC GCGAGCCGGG AAGGCCGACC GACCAAATAA CGACTATTTA GACCTCGGCC 

amp 



2171 TGAGCGTGGG TCTCGCGGTA TCATTGCAGC ACTGGGGCCA GATGGTAAGC CCTCCCGTAT CGTAGTTATC 
ACTCGCACCC AGAGCGCCAT AGTAACGTCG TGACCCCGGT CTACCATTCG GGAGGGCATA GCATCAATAG 

amp 



2241 TACACGACGG GGAGTCAGGC AACTATGGAT GAACGAAATA GACAGATCGC TGAGATAGGT GCCTCACTGA 
ATGTGCTGCC CCTCAGTCCG TTGATACCTA CTTGCTTTAT CTGTCTAGCG ACTCTATCCA CGGAGTGACT 

amp 



2311 TTAAGCATTG GTAACTGTCA GACCAAGTTT ACTCATATAT ACTTTAGATT GATTTAAAAC TTCATTTTTA 
AATTCGTAAC CATTGACAGT CTGGTTCAAA TGAGTATATA TGAAATCTAA CTAAATTTTG AAGTAAAAAT 

2381 ATTTAAAAGG ATCTAGGTGA AGATCCTTTT TGATAATCTC ATGACCAAAA TCCCTTAACG TGAGTTTTCG 
TAAATTTTCC TAGATCCACT TCTAGGAAAA ACTATTAGAG TACTGGTTTT AGGGAATTGC ACTCAAAAGC 

2451 TTCCACTGAG CGTCAGACCC CGTAGAAAAG ATCAAAGGAT CTTCTTGAGA TCCTTTTTTT CTGCGCGTAA 
AAGGTGACTC GCAGTCTGGG GCATCTTTTC TAGTTTCCTA GAAGAACTCT AGGAAAAAAA GACGCGCATT 

2521 TCTGCTGCTT GCAAACAAAA AAACCACCGC TACCAGCGGT GGTTTGTTTG CCGGATCAAG AGCTACCAAC 
AGACGACGAA CGTTTGTTTT TTTGGTGGCG ATGGTCGCCA CCAAACAAAC GGCCTAGTTC TCGATGGTTG 

2591 TCTTTTTCCG AAGGTAACTG GCTTCAGCAG AGCGCAGATA CCAAATACTG TCCTTCTAGT GTAGCCGTAG 
AGAAAAAGGC TTCCATTGAC CGAAGTCGTC TCGCGTCTAT GGTTTATGAC AGGAAGATCA CATCGGCATC 

2661 TTAGGCCACC ACTTCAAGAA CTCTGTAGCA CCGCCTACAT ACCTCGCTCT GCTAATCCTG TTACCAGTGG 
AATCCGGTGG TGAAGTTCTT GAGACATCGT GGCGGATGTA TGGAGCGAGA CGATTAGGAC AATGGTCACC 

2731 CTGCTGCCAG TGGCGATAAG TCGTGTCTTA CCGGGTTGGA CTCAAGACGA TAGTTACCGG ATAAGGCGCA
GACGACGGTC ACCGCTATTC AGCACAGAAT GGCCCAACCT GAGTTCTGCT ATCAATGGCC TATTCCGCGT 

2801 GCGGTCGGGC TGAACGGGGG GTTCGTGCAC ACAGCCCAGC TTGGAGCGAA CGACCTACAC CGAACTGAGA
CGCCAGCCCG ACTTGCCCCC CAAGCACGTG TGTCGGGTCG AACCTCGCTT GCTGGATGTG GCTTGACTCT 

2871 TACCTACAGC GTGAGCATTG AGAAAGCGCC ACGCTTCCCG AAGGGAGAAA GGCGGACAGG TATCCGGTAA 
ATGGATGTCG CACTCGTAAC TCTTTCGCGG TGCGAAGGGC TTCCCTCTTT CCGCCTGTCC ATAGGCCATT 

2941 GCGGCAGGGT CGGAACAGGA GAGCGCACGA GGGAGCTTCC AGGGGGAAAC GCCTGGTATC TTTATAGTCC 
CGCCGTCCCA GCCTTGTCCT CTCGCGTGCT CCCTCGAAGG TCCCCCTTTG CGGACCATAG AAATATCAGG 

3011 TGTCGGGTTT CGCCACCTCT GACTTGAGCG TCGATTTTTG TGATGCTCGT CAGGGGGGCG GAGCCTATGG 
ACAGCCCAAA GCGGTGGAGA CTGAACTCGC AGCTAAAAAC ACTACGAGCA GTCCCCCCGC CTCGGATACC 

3081 AAAAACGCCA GCAACGCGGC CTTTTTACGG TTCCTGGCCT TTTGCTGGCC TTTTGCTCAC ATGTTCTTTC 
TTTTTGCGGT CGTTGCGCCG GAAAAATGCC AAGGACCGGA AAACGACCGG AAAACGAGTG TACAAGAAAG 

3151 CTGCGTTATC CCCTGATTCT GTGGATAACC GTATTACCGC CTTTGAGTGA GCTGATACCG CTCGCCGCAG 
GACGCAATAG GGGACTAAGA CACCTATTGG CATAATGGCG GAAACTCACT CGACTATGGC GAGCGGCGTC 

3221 CCGAACGACC GAGCGCAGCG AGTCAGTGAG CGAGGAAGCG GAAGAGCGCC CAATACGCAA ACCGCCTCTC 
GGCTTGCTGG CTCGCGTCGC TCAGTCACTC GCTCCTTCGC CTTCTCGCGG GTTATGCGTT TGGCGGAGAG 

3291 CCCGCGCGTT GGCCGATTCA TTAATGCAGC TGGCACGACA GGTTTCCCGA CTGGAAAGCG GGCAGTGAGC 
GGGCGCGCAA CCGGCTAAGT AATTACGTCG ACCGTGCTGT CCAAAGGGCT GACCTTTCGC CCGTCACTCG 

3361 GCAACGCAAT TAATGTGAGT TAGCTCACTC ATTAGGCACC CCAGGCTTTA CACTTTATGC TTCCGGCTCG
CGTTGCGTTA ATTACACTCA ATCGAGTGAG TAATCCGTGG GGTCCGAAAT GTGAAATACG AAGGCCGAGC

3431 TATGTTGTGT GGAATTGTGA GCGGATAACA ATTTCACACA GGAAACAGCT ATGACCATGA TTACGCCA 
ATACAACACA CCTTAACACT CGCCTATTGT TAAAGTGTGT CCTTTGTCGA TACTGGTACT AATGCGGT 
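
The feature labels printed over the first two rows of the annotated sequence above (T7 promoter, Outron) can be related to coordinates with a short script. The Python sketch below is an illustration only and is not part of the original disclosure. It uses the first 140 nucleotides of the top strand as transcribed from the figure and searches for four motifs: the commonly quoted 19-nt T7 promoter core, the XbaI and XmaI recognition sites that flank the inserted oligonucleotides, and the TTTCAG acceptor named in the claims. The dictionary labels and the helper name `locate` are assumptions made for the example.

```python
# First 140 nt of the top strand, transcribed in blocks of ten from the figure.
TOP_STRAND = "".join([
    "AGCTTGGCGC", "CTAATACGAC", "TCACTATAGG", "GCTGCAGGTC", "GACTCTAGAT",
    "TACAACTAAT", "TATACTTATT", "TGAATATTCA", "AATTTTCAGA", "CCCGGGATTG",
    "GCCAAAGGAC", "CCAAAGGTAT", "GTTTCGAATG", "ATACTAACAT",
])

# Motifs to look for; the labels are illustrative, not taken from the figure.
FEATURES = {
    "T7 promoter core": "TAATACGACTCACTATAGG",
    "XbaI site": "TCTAGA",
    "outron 3' splice acceptor": "TTTCAG",
    "XmaI site": "CCCGGG",
}


def locate(seq, motif):
    """Return the 1-based (start, end) of the first occurrence of motif, or None."""
    idx = seq.find(motif)
    return None if idx < 0 else (idx + 1, idx + len(motif))


for name, motif in FEATURES.items():
    print(f"{name}: {locate(TOP_STRAND, motif)}")
```

Only the first occurrence of each motif is reported, which is sufficient here because the region of interest lies at the start of the sequence. A full annotation would need to scan both strands and report every match.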
T7 promoter Outron



1 AGCTTGGCGC CTAATACGAC TCACTATAGG GCTGCAGGTC GACTCTAGAT TACAACTAAT TATACTTATT 
TCGAACCGCG GATTATGCTG AGTGATATCC CGACGTCCAG CTGAGATCTA ATGTTGATTA ATATGAATAA 

Outron synth. intron A 

71 TGAATATTCA AATTTTCAGA CCCGGGATTG GCCAAAGGAC CCAAAGGTAT GTTTCGAATG ATACTAACAT 
ACTTATAAGT TTAAAAGTCT GGGCCCTAAC CGGTTTCCTG GGTTTCCATA CAAAGCTTAC TATGATTGTA 

synth. intron A GFP with introns 

141 AACATAGAAC ATTTTCAGGA GGACCCTTGG CTAGCGTCGA CGGTACCATG GGGCGCGCCA TGAGTAAAGG 
TTGTATCTTG TAAAAGTCCT CCTGGGAACC GATCGCAGCT GCCATGGTAC CCCGCGCGGT ACTCATTTCC 

GFP with introns 



211 AGAAGAACTT TTCACTGGAG TTGTCCCAAT TCTTGTTGAA TTAGATGGTG ATGTTAATGG GCACAAATTT 
TCTTCTTGAA AAGTGACCTC AACAGGGTTA AGAAGAACTT AATCTACCAC TACAATTACC CGTGTTTAAA 

GFP with introns 



281 TCTGTCAGTG GAGAGGGTGA AGGTGATGCA ACATACGGAA AACTTACCCT TAAATTTATT TGCACTACTG 
AGACAGTCAC CTCTCCCACT TCCACTACGT TGTATGCCTT TTGAATGGGA ATTTAAATAA ACGTGATGAC 

GFP with introns 



351 GAAAACTACC TGTTCCATGG GTAAGTTTAA ACATATATAT ACTAACTAAC CCTGATTATT TAAATTTTCA 
CTTTTGATGG ACAAGGTACC CATTCAAATT TGTATATATA TGATTGATTG GGACTAATAA ATTTAAAAGT 

GFP with introns 

421 GCCAACACTT GTCACTACTT TCTGTTATGG TGTTCAATGC TTCTCGAGAT ACCCAGATCA TATGAAACGG 
CGGTTGTGAA CAGTGATGAA AGACAATACC ACAAGTTACG AAGAGCTCTA TGGGTCTAGT ATACTTTGCC 



GFP with introns 



491 CATGACTTTT TCAAGAGTGC CATGCCCGAA GGTTATGTAC AGGAAAGAAC TATATTTTTC AAAGATGACG 
GTACTGAAAA AGTTCTCACG GTACGGGCTT CCAATACATG TCCTTTCTTG ATATAAAAAG TTTCTACTGC 

GFP with introns



561 GGAACTACAA GACACGTAAG TTTAAACAGT TCGGTACTAA CTAACCATAC ATATTTAAAT TTTCAGGTGC 
CCTTGATGTT CTGTGCATTC AAATTTGTCA AGCCATGATT GATTGGTATG TATAAATTTA AAAGTCCACG 

GFP with introns 



631 TGAAGTCAAG TTTGAAGGTG ATACCCTTGT TAATAGAATC GAGTTAAAAG GTATTGATTT TAAAGAAGAT 
ACTTCAGTTC AAACTTCCAC TATGGGAACA ATTATCTTAG CTCAATTTTC CATAACTAAA ATTTCTTCTA 

GFP with introns 



701 GGAAACATTC TTGGACACAA ATTGGAATAC AACTATAACT CACACAATGT ATACATCATG GCAGACAAAC 
CCTTTGTAAG AACCTGTGTT TAACCTTATG TTGATATTGA GTGTGTTACA TATGTAGTAC CGTCTGTTTG 

GFP with introns 



771 AAAAGAATGG AATCAAAGTT GTAAGTTTAA ACTTGGACTT ACTAACTAAC GGATTATATT TAAATTTTCA 
TTTTCTTACC TTAGTTTCAA CATTCAAATT TGAACCTGAA TGATTGATTG CCTAATATAA ATTTAAAAGT 



GFP with introns 



841 GAACTTCAAA ATTAGACACA ACATTGAAGA TGGAAGCGTT CAACTAGCAG ACCATTATCA ACAAAATACT 
CTTGAAGTTT TAATCTGTGT TGTAACTTCT ACCTTCGCAA GTTGATCGTC TGGTAATAGT TGTTTTATGA 

GFP with introns 

911 CCAATTGGCG ATGGCCCTGT CCTTTTACCA GACAACCATT ACCTGTCCAC ACAATCTGCC CTTTCGAAAG
GGTTAACCGC TACCGGGACA GGAAAATGGT CTGTTGGTAA TGGACAGGTG TGTTAGACGG GAAAGCTTTC 



GFP with introns 



981 ATCCCAACGA AAAGAGAGAC CACATGGTCC TTCTTGAGTT TGTAACAGCT GCTGGGATTA CACATGGCAT 
TAGGGTTGCT TTTCTCTCTG GTGTACCAGG AAGAACTCAA ACATTGTCGA CGACCCTAAT GTGTACCGTA 






GFP with introns unc-54 3' UTR



1051 GGATGAACTA TACAAATAGG GCCGGCCGAG CTCCGCATCG GCCGCTGTCA TCAGATCGCC ATCTCGCGCC 
CCTACTTGAT ATGTTTATCC CGGCCGGCTC GAGGCGTAGC CGGCGACAGT AGTCTAGCGG TAGAGCGCGG



unc-54 3' UTR



1121 CGTGCCTCTG ACTTCTAAGT CCAATTACTC TTCAACATCC CTACATGCTC TTTCTCCCTG TGCTCCCACC 
GCACGGAGAC TGAAGATTCA GGTTAATGAG AAGTTGTAGG GATGTACGAG AAAGAGGGAC ACGAGGGTGG 



unc-54 3' UTR



1191 CCCTATTTTT GTTATTATCA AAAAAACTTC TTCTTAATTT CTTTGTTTTT TAGCTTCTTT TAAGTCACCT 
GGGATAAAAA CAATAATAGT TTTTTTGAAG AAGAATTAAA GAAACAAAAA ATCGAAGAAA ATTCAGTGGA

unc-54 3' UTR



1261 CTAACAATGA AATTGTGTAG ATTCAAAAAT AGAATTAATT CGTAATAAAA AGTCGAAAAA AATTGTGCTC 
GATTGTTACT TTAACACATC TAAGTTTTTA TCTTAATTAA GCATTATTTT TCAGCTTTTT TTAACACGAG 

unc-54 3' UTR 



1331 CCTCCCCCCA TTAATAATAA TTCTATCCCA AAATCTACAC AATGTTCTGT GTACACTTCT TATGTTTTTT 
GGAGGGGGGT AATTATTATT AAGATAGGGT TTTAGATGTG TTACAAGACA CATGTGAAGA ATACAAAAAA 

unc-54 3' UTR



1401 TTACTTCTGA TAAATTTTTT TTGAAACATC ATAGAAAAAA CCGCACACAA AATACCTTAT CATATGTTAC 
AATGAAGACT ATTTAAAAAA AACTTTGTAG TATCTTTTTT GGCGTGTGTT TTATGGAATA GTATACAATG 

unc-54 3' UTR 



1471 GTTTCAGTTT ATGACCGCAA TTTTTATTTC TTCGCACGTC TGGGCCTCTC ATGACGTCAA ATCATGCTCA 
CAAAGTCAAA TACTGGCGTT AAAAATAAAG AAGCGTGCAG ACCCGGAGAG TACTGCAGTT TAGTACGAGT 

unc-54 3' UTR 



1541 TCGTGAAAAA GTTTTGGAGT ATTTTTGGAA TTTTTCAATC AAGTGAAAGT TTATGAAATT AATTTTCCTG 
AGCACTTTTT CAAAACCTCA TAAAAACCTT AAAAAGTTAG TTCACTTTCA AATACTTTAA TTAAAAGGAC 



unc-54 3' UTR



1611 CTTTTGCTTT TTGGGGGTTT CCCCTATTGT TTGTCAAGAG TTTCGAGGAC GGCGTTTTTC TTGCTAAAAT 
GAAAACGAAA AACCCCCAAA GGGGATAACA AACAGTTCTC AAAGCTCCTG CCGCAAAAAG AACGATTTTA 

unc-54 3' UTR



1681 CACAAGTATT GATGAGCACG ATGCAAGAAA GATCGGAAGA AGGTTTGGGT TTGAGGCTCA GTGGAAGGTG 
GTGTTCATAA CTACTCGTGC TACGTTCTTT CTAGCCTTCT TCCAAACCCA AACTCCGAGT CACCTTCCAC 



unc-54 3' UTR 



1751 AGTAGAAGTT GATAATTTGA AAGTGGAGTA GTGTCTATGG GGTTTTTGCC TTAAATGACA GAATACATTC 
TCATCTTCAA CTATTAAACT TTCACCTCAT CACAGATACC CCAAAAACGG AATTTACTGT CTTATGTAAG 

unc-54 3' UTR



1821 CCAATATACC AAACATAACT GTTTCCTACT AGTCGGCCGT ACGGGCCCTT TCGTCTCGCG CGTTTCGGTG 
GGTTATATGG TTTGTATTGA CAAAGGATGA TCAGCCGGCA TGCCCGGGAA AGCAGAGCGC GCAAAGCCAC 

1891 ATGACGGTGA AAACCTCTGA CACATGCAGC TCCCGGAGAC GGTCACAGCT TGTCTGTAAG CGGATGCCGG 
TACTGCCACT TTTGGAGACT GTGTACGTCG AGGGCCTCTG CCAGTGTCGA ACAGACATTC GCCTACGGCC 

1961 GAGCAGACAA GCCCGTCAGG GCGCGTCAGC GGGTGTTGGC GGGTGTCGGG GCTGGCTTAA CTATGCGGCA 
CTCGTCTGTT CGGGCAGTCC CGCGCAGTCG CCCACAACCG CCCACAGCCC CGACCGAATT GATACGCCGT

2031 TCAGAGCAGA TTGTACTGAG AGTGCACCAT ATGCGGTGTG AAATACCGCA CAGATGCGTA AGGAGAAAAT 
AGTCTCGTCT AACATGACTC TCACGTGGTA TACGCCACAC TTTATGGCGT GTCTACGCAT TCCTCTTTTA 

2101 ACCGCATCAG GCGGCCTTAA GGGCCTCGTG ATACGCCTAT TTTTATAGGT TAATGTCATG ATAATAATGG
TGGCGTAGTC CGCCGGAATT CCCGGAGCAC TATGCGGATA AAAATATCCA ATTACAGTAC TATTATTACC

2171 TTTCTTAGAC GTCAGGTGGC ACTTTTCGGG GAAATGTGCG CGGAACCCCT ATTTGTTTAT TTTTCTAAAT
AAAGAATCTG CAGTCCACCG TGAAAAGCCC CTTTACACGC GCCTTGGGGA TAAACAAATA AAAAGATTTA 

2241 ACATTCAAAT ATGTATCCGC TCATGAGACA ATAACCCTGA TAAATGCTTC AATAATATTG AAAAAGGAAG 
TGTAAGTTTA TACATAGGCG AGTACTCTGT TATTGGGACT ATTTACGAAG TTATTATAAC TTTTTCCTTC 



amp 



2311 AGTATGAGTA TTCAACATTT CCGTGTCGCC CTTATTCCCT TTTTTGCGGC ATTTTGCCTT CCTGTTTTTG 
TCATACTCAT AAGTTGTAAA GGCACAGCGG GAATAAGGGA AAAAACGCCG TAAAACGGAA GGACAAAAAC 



amp 



2381 CTCACCCAGA AACGCTGGTG AAAGTAAAAG ATGCTGAAGA TCAGTTGGGT GCACGAGTGG GTTACATCGA 
GAGTGGGTCT TTGCGACCAC TTTCATTTTC TACGACTTCT AGTCAACCCA CGTGCTCACC CAATGTAGCT 



amp 



2451 ACTGGATCTC AACAGCGGTA AGATCCTTGA GAGTTTTCGC CCCGAAGAAC GTTTTCCAAT GATGAGCACT 
TGACCTAGAG TTGTCGCCAT TCTAGGAACT CTCAAAAGCG GGGCTTCTTG CAAAAGGTTA CTACTCGTGA 

amp 

2521 TTTAAAGTTC TGCTATGTGG CGCGGTATTA TCCCGTATTG ACGCCGGGCA AGAGCAACTC GGTCGCCGCA 
AAATTTCAAG ACGATACACC GCGCCATAAT AGGGCATAAC TGCGGCCCGT TCTCGTTGAG CCAGCGGCGT 



amp 

2591 TACACTATTC TCAGAATGAC TTGGTTGAGT ACTCACCAGT CACAGAAAAG CATCTTACGG ATGGCATGAC 
ATGTGATAAG AGTCTTACTG AACCAACTCA TGAGTGGTCA GTGTCTTTTC GTAGAATGCC TACCGTACTG 

amp 



2661 AGTAAGAGAA TTATGCAGTG CTGCCATAAC CATGAGTGAT AACACTGCGG CCAACTTACT TCTGACAACG 
TCATTCTCTT AATACGTCAC GACGGTATTG GTACTCACTA TTGTGACGCC GGTTGAATGA AGACTGTTGC 

amp 



2731 ATCGGAGGAC CGAAGGAGCT AACCGCTTTT TTGCACAACA TGGGGGATCA TGTAACTCGC CTTGATCGTT 
TAGCCTCCTG GCTTCCTCGA TTGGCGAAAA AACGTGTTGT ACCCCCTAGT ACATTGAGCG GAACTAGCAA 



amp 

2801 GGGAACCGGA GCTGAATGAA GCCATACCAA ACGACGAGCG TGACACCACG ATGCCTGTAG CAATGGCAAC 
CCCTTGGCCT CGACTTACTT CGGTATGGTT TGCTGCTCGC ACTGTGGTGC TACGGACATC GTTACCGTTG 



amp 



2871 AACGTTGCGC AAACTATTAA CTGGCGAACT ACTTACTCTA GCTTCCCGGC AACAATTAAT AGACTGGATG 
TTGCAACGCG TTTGATAATT GACCGCTTGA TGAATGAGAT CGAAGGGCCG TTGTTAATTA TCTGACCTAC 

amp



2941 GAGGCGGATA AAGTTGCAGG ACCACTTCTG CGCTCGGCCC TTCCGGCTGG CTGGTTTATT GCTGATAAAT 
CTCCGCCTAT TTCAACGTCC TGGTGAAGAC GCGAGCCGGG AAGGCCGACC GACCAAATAA CGACTATTTA 

amp 

3011 CTGGAGCCGG TGAGCGTGGG TCTCGCGGTA TCATTGCAGC ACTGGGGCCA GATGGTAAGC CCTCCCGTAT 
GACCTCGGCC ACTCGCACCC AGAGCGCCAT AGTAACGTCG TGACCCCGGT CTACCATTCG GGAGGGCATA 

amp 



3081 CGTAGTTATC TACACGACGG GGAGTCAGGC AACTATGGAT GAACGAAATA GACAGATCGC TGAGATAGGT 
GCATCAATAG ATGTGCTGCC CCTCAGTCCG TTGATACCTA CTTGCTTTAT CTGTCTAGCG ACTCTATCCA

amp



3151 GCCTCACTGA TTAAGCATTG GTAACTGTCA GACCAAGTTT ACTCATATAT ACTTTAGATT GATTTAAAAC
CGGAGTGACT AATTCGTAAC CATTGACAGT CTGGTTCAAA TGAGTATATA TGAAATCTAA CTAAATTTTG

3221 TTCATTTTTA ATTTAAAAGG ATCTAGGTGA AGATCCTTTT TGATAATCTC ATGACCAAAA TCCCTTAACG
AAGTAAAAAT TAAATTTTCC TAGATCCACT TCTAGGAAAA ACTATTAGAG TACTGGTTTT AGGGAATTGC

3291 TGAGTTTTCG TTCCACTGAG CGTCAGACCC CGTAGAAAAG ATCAAAGGAT CTTCTTGAGA TCCTTTTTTT
ACTCAAAAGC AAGGTGACTC GCAGTCTGGG GCATCTTTTC TAGTTTCCTA GAAGAACTCT AGGAAAAAAA

3361 CTGCGCGTAA TCTGCTGCTT GCAAACAAAA AAACCACCGC TACCAGCGGT GGTTTGTTTG CCGGATCAAG
GACGCGCATT AGACGACGAA CGTTTGTTTT TTTGGTGGCG ATGGTCGCCA CCAAACAAAC GGCCTAGTTC

3431 AGCTACCAAC TCTTTTTCCG AAGGTAACTG GCTTCAGCAG AGCGCAGATA CCAAATACTG TCCTTCTAGT
TCGATGGTTG AGAAAAAGGC TTCCATTGAC CGAAGTCGTC TCGCGTCTAT GGTTTATGAC AGGAAGATCA

3501 GTAGCCGTAG TTAGGCCACC ACTTCAAGAA CTCTGTAGCA CCGCCTACAT ACCTCGCTCT GCTAATCCTG
CATCGGCATC AATCCGGTGG TGAAGTTCTT GAGACATCGT GGCGGATGTA TGGAGCGAGA CGATTAGGAC



3571 TTACCAGTGG CTGCTGCCAG TGGCGATAAG TCGTGTCTTA CCGGGTTGGA CTCAAGACGA TAGTTACCGG 
AATGGTCACC GACGACGGTC ACCGCTATTC AGCACAGAAT GGCCCAACCT GAGTTCTGCT ATCAATGGCC 

3641 ATAAGGCGCA GCGGTCGGGC TGAACGGGGG GTTCGTGCAC ACAGCCCAGC TTGGAGCGAA CGACCTACAC 
TATTCCGCGT CGCCAGCCCG ACTTGCCCCC CAAGCACGTG TGTCGGGTCG AACCTCGCTT GCTGGATGTG 

3711 CGAACTGAGA TACCTACAGC GTGAGCATTG AGAAAGCGCC ACGCTTCCCG AAGGGAGAAA GGCGGACAGG 
GCTTGACTCT ATGGATGTCG CACTCGTAAC TCTTTCGCGG TGCGAAGGGC TTCCCTCTTT CCGCCTGTCC 

3781 TATCCGGTAA GCGGCAGGGT CGGAACAGGA GAGCGCACGA GGGAGCTTCC AGGGGGAAAC GCCTGGTATC 
ATAGGCCATT CGCCGTCCCA GCCTTGTCCT CTCGCGTGCT CCCTCGAAGG TCCCCCTTTG CGGACCATAG 

3851 TTTATAGTCC TGTCGGGTTT CGCCACCTCT GACTTGAGCG TCGATTTTTG TGATGCTCGT CAGGGGGGCG 
AAATATCAGG ACAGCCCAAA GCGGTGGAGA CTGAACTCGC AGCTAAAAAC ACTACGAGCA GTCCCCCCGC 

3921 GAGCCTATGG AAAAACGCCA GCAACGCGGC CTTTTTACGG TTCCTGGCCT TTTGCTGGCC TTTTGCTCAC 
CTCGGATACC TTTTTGCGGT CGTTGCGCCG GAAAAATGCC AAGGACCGGA AAACGACCGG AAAACGAGTG

3991 ATGTTCTTTC CTGCGTTATC CCCTGATTCT GTGGATAACC GTATTACCGC CTTTGAGTGA GCTGATACCG 
TACAAGAAAG GACGCAATAG GGGACTAAGA CACCTATTGG CATAATGGCG GAAACTCACT CGACTATGGC 

4061 CTCGCCGCAG CCGAACGACC GAGCGCAGCG AGTCAGTGAG CGAGGAAGCG GAAGAGCGCC CAATACGCAA 
GAGCGGCGTC GGCTTGCTGG CTCGCGTCGC TCAGTCACTC GCTCCTTCGC CTTCTCGCGG GTTATGCGTT 

4131 ACCGCCTCTC CCCGCGCGTT GGCCGATTCA TTAATGCAGC TGGCACGACA GGTTTCCCGA CTGGAAAGCG 
TGGCGGAGAG GGGCGCGCAA CCGGCTAAGT AATTACGTCG ACCGTGCTGT CCAAAGGGCT GACCTTTCGC 

4201 GGCAGTGAGC GCAACGCAAT TAATGTGAGT TAGCTCACTC ATTAGGCACC CCAGGCTTTA CACTTTATGC 
CCGTCACTCG CGTTGCGTTA ATTACACTCA ATCGAGTGAG TAATCCGTGG GGTCCGAAAT GTGAAATACG 

4271 TTCCGGCTCG TATGTTGTGT GGAATTGTGA GCGGATAACA ATTTCACACA GGAAACAGCT ATGACCATGA 
AAGGCCGAGC ATACAACACA CCTTAACACT CGCCTATTGT TAAAGTGTGT CCTTTGTCGA TACTGGTACT 

4341 TTACGCCA 
AATGCGGT 
SEQUENCE LISTING



<110> DEVGEN NV 



<120> GENE EXPRESSION SYSTEM 



<130> SCB/55177/001



<140> 
<141> 



<160> 5 



<170> PatentIn Ver. 2.0



<210> 1 
<211> 47 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: 
oligonucleotide o-GN59 



<400> 1 

ctagattaca actaattata cttatttgaa tattcaaatt ttcagac 47 

<210> 2 
<211> 47 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence:
oligonucleotide o-GN60

<400> 2 

ccgggtctga aaatttgaat attcaaataa gtataattag ttgtaat 47 



<210> 3 
<211> 3498 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: plasmid 
pDW3123 



<400> 3 

agcttggcgc ctaatacgac tcactatagg gctgcaggtc gactctagat tacaactaat 60 

tatacttatt tgaatattca aattttcaga cccgggattg gccaaaggac ccaaaggtat 120 

gtttcgaatg atactaacat aacatagaac attttcagga ggacccttgg ctagcgtcct 180 

gctgggatta cacatggcat ggatgaacta tacaaatagg gccggccgag ctccgcatcg 240 

gccgctgtca tcagatcgcc atctcgcgcc cgtgcctctg acttctaagt ccaattactc 300 

ttcaacatcc ctacatgctc tttctccctg tgctcccacc ccctattttt gttattatca 360 

aaaaaacttc ttcttaattt ctttgttttt tagcttcttt taagtcacct ctaacaatga 420 

aattgtgtag attcaaaaat agaattaatt cgtaataaaa agtcgaaaaa aattgtgctc 480 

cctcccccca ttaataataa ttctatccca aaatctacac aatgttctgt gtacacttct 540 

tatgtttttt ttacttctga taaatttttt ttgaaacatc atagaaaaaa ccgcacacaa 600

aataccttat catatgttac gtttcagttt atgaccgcaa tttttatttc ttcgcacgtc 660 

tgggcctctc atgacgtcaa atcatgctca tcgtgaaaaa gttttggagt atttttggaa 720 
tttttcaatc aagtgaaagt ttatgaaatt aattttcctg cttttgcttt ttgggggttt 780

cccctattgt ttgtcaagag tttcgaggac ggcgtttttc ttgctaaaat cacaagtatt 840 

gatgagcacg atgcaagaaa gatcggaaga aggtttgggt ttgaggctca gtggaaggtg 900 

agtagaagtt gataatttga aagtggagta gtgtctatgg ggtttttgcc ttaaatgaca 960 

gaatacattc ccaatatacc aaacataact gtttcctact agtcggccgt acgggccctt 1020 

tcgtctcgcg cgtttcggtg atgacggtga aaacctctga cacatgcagc tcccggagac 1080

ggtcacagct tgtctgtaag cggatgccgg gagcagacaa gcccgtcagg gcgcgtcagc 1140

gggtgttggc gggtgtcggg gctggcttaa ctatgcggca tcagagcaga ttgtactgag 1200 

agtgcaccat atgcggtgtg aaataccgca cagatgcgta aggagaaaat accgcatcag 1260 

gcggccttaa gggcctcgtg atacgcctat ttttataggt taatgtcatg ataataatgg 1320 

tttcttagac gtcaggtggc acttttcggg gaaatgtgcg cggaacccct atttgtttat 1380 

ttttctaaat acattcaaat atgtatccgc tcatgagaca ataaccctga taaatgcttc 1440

aataatattg aaaaaggaag agtatgagta ttcaacattt ccgtgtcgcc cttattccct 1500 

tttttgcggc attttgcctt cctgtttttg ctcacccaga aacgctggtg aaagtaaaag 1560 

atgctgaaga tcagttgggt gcacgagtgg gttacatcga actggatctc aacagcggta 1620 

agatccttga gagttttcgc cccgaagaac gttttccaat gatgagcact tttaaagttc 1680 

tgctatgtgg cgcggtatta tcccgtattg acgccgggca agagcaactc ggtcgccgca 1740 

tacactattc tcagaatgac ttggttgagt actcaccagt cacagaaaag catcttacgg 1800 

atggcatgac agtaagagaa ttatgcagtg ctgccataac catgagtgat aacactgcgg 1860 

ccaacttact tctgacaacg atcggaggac cgaaggagct aaccgctttt ttgcacaaca 1920 

tgggggatca tgtaactcgc cttgatcgtt gggaaccgga gctgaatgaa gccataccaa 1980 

acgacgagcg tgacaccacg atgcctgtag caatggcaac aacgttgcgc aaactattaa 2040 

ctggcgaact acttactcta gcttcccggc aacaattaat agactggatg gaggcggata 2100 

aagttgcagg accacttctg cgctcggccc ttccggctgg ctggtttatt gctgataaat 2160 

ctggagccgg tgagcgtggg tctcgcggta tcattgcagc actggggcca gatggtaagc 2220 

cctcccgtat cgtagttatc tacacgacgg ggagtcaggc aactatggat gaacgaaata 2280 

gacagatcgc tgagataggt gcctcactga ttaagcattg gtaactgtca gaccaagttt 2340 

actcatatat actttagatt gatttaaaac ttcattttta atttaaaagg atctaggtga 2400 

agatcctttt tgataatctc atgaccaaaa tcccttaacg tgagttttcg ttccactgag 2460 

cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga tccttttttt ctgcgcgtaa 2520 

tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg ccggatcaag 2580 

agctaccaac tctttttccg aaggtaactg gcttcagcag agcgcagata ccaaatactg 2640

tccttctagt gtagccgtag ttaggccacc acttcaagaa ctctgtagca ccgcctacat 2700 

acctcgctct gctaatcctg ttaccagtgg ctgctgccag tggcgataag tcgtgtctta 2760

ccgggttgga ctcaagacga tagttaccgg ataaggcgca gcggtcgggc tgaacggggg 2820 

gttcgtgcac acagcccagc ttggagcgaa cgacctacac cgaactgaga tacctacagc 2880 

gtgagcattg agaaagcgcc acgcttcccg aagggagaaa ggcggacagg tatccggtaa 2940

gcggcagggt cggaacagga gagcgcacga gggagcttcc agggggaaac gcctggtatc 3000 

tttatagtcc tgtcgggttt cgccacctct gacttgagcg tcgatttttg tgatgctcgt 3060 

caggggggcg gagcctatgg aaaaacgcca gcaacgcggc ctttttacgg ttcctggcct 3120 

tttgctggcc ttttgctcac atgttctttc ctgcgttatc ccctgattct gtggataacc 3180 

gtattaccgc ctttgagtga gctgataccg ctcgccgcag ccgaacgacc gagcgcagcg 3240 

agtcagtgag cgaggaagcg gaagagcgcc caatacgcaa accgcctctc cccgcgcgtt 3300 

ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg ggcagtgagc 3360 

gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta cactttatgc 3420 

ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca ggaaacagct 3480 
atgaccatga ttacgcca 3498 

<210> 4 
<211> 4348 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plasmid 
pDW3124

<400> 4 

agcttggcgc ctaatacgac tcactatagg gctgcaggtc gactctagat tacaactaat 60 

tatacttatt tgaatattca aattttcaga cccgggattg gccaaaggac ccaaaggtat 120 

gtttcgaatg atactaacat aacatagaac attttcagga ggacccttgg ctagcgtcga 180 
cggtaccatg gggcgcgcca tgagtaaagg agaagaactt ttcactggag ttgtcccaat 240

tcttgttgaa ttagatggtg atgttaatgg gcacaaattt tctgtcagtg gagagggtga 300 

aggtgatgca acatacggaa aacttaccct taaatttatt tgcactactg gaaaactacc 360 

tgttccatgg gtaagtttaa acatatatat actaactaac cctgattatt taaattttca 420 

gccaacactt gtcactactt tctgttatgg tgttcaatgc ttctcgagat acccagatca 480 

tatgaaacgg catgactttt tcaagagtgc catgcccgaa ggttatgtac aggaaagaac 540 

tatatttttc aaagatgacg ggaactacaa gacacgtaag tttaaacagt tcggtactaa 600 

ctaaccatac atatttaaat tttcaggtgc tgaagtcaag tttgaaggtg atacccttgt 660 

taatagaatc gagttaaaag gtattgattt taaagaagat ggaaacattc ttggacacaa 720 

attggaatac aactataact cacacaatgt atacatcatg gcagacaaac aaaagaatgg 780 

aatcaaagtt gtaagtttaa acttggactt actaactaac ggattatatt taaattttca 840

gaacttcaaa attagacaca acattgaaga tggaagcgtt caactagcag accattatca 900 

acaaaatact ccaattggcg atggccctgt ccttttacca gacaaccatt acctgtccac 960 

acaatctgcc ctttcgaaag atcccaacga aaagagagac cacatggtcc ttcttgagtt 1020 

tgtaacagct gctgggatta cacatggcat ggatgaacta tacaaatagg gccggccgag 1080 

ctccgcatcg gccgctgtca tcagatcgcc atctcgcgcc cgtgcctctg acttctaagt 1140 

ccaattactc ttcaacatcc ctacatgctc tttctccctg tgctcccacc ccctattttt 1200 

gttattatca aaaaaacttc ttcttaattt ctttgttttt tagcttcttt taagtcacct 1260 

ctaacaatga aattgtgtag attcaaaaat agaattaatt cgtaataaaa agtcgaaaaa 1320 

aattgtgctc cctcccccca ttaataataa ttctatccca aaatctacac aatgttctgt 1380 

gtacacttct tatgtttttt ttacttctga taaatttttt ttgaaacatc atagaaaaaa 1440 

ccgcacacaa aataccttat catatgttac gtttcagttt atgaccgcaa tttttatttc 1500 

ttcgcacgtc tgggcctctc atgacgtcaa atcatgctca tcgtgaaaaa gttttggagt 1560 

atttttggaa tttttcaatc aagtgaaagt ttatgaaatt aattttcctg cttttgcttt 1620 

ttgggggttt cccctattgt ttgtcaagag tttcgaggac ggcgtttttc ttgctaaaat 1680 

cacaagtatt gatgagcacg atgcaagaaa gatcggaaga aggtttgggt ttgaggctca 1740 

gtggaaggtg agtagaagtt gataatttga aagtggagta gtgtctatgg ggtttttgcc 1800 

ttaaatgaca gaatacattc ccaatatacc aaacataact gtttcctact agtcggccgt 1860 

acgggccctt tcgtctcgcg cgtttcggtg atgacggtga aaacctctga cacatgcagc 1920 

tcccggagac ggtcacagct tgtctgtaag cggatgccgg gagcagacaa gcccgtcagg 1980 

gcgcgtcagc gggtgttggc gggtgtcggg gctggcttaa ctatgcggca tcagagcaga 2040 

ttgtactgag agtgcaccat atgcggtgtg aaataccgca cagatgcgta aggagaaaat 2100 

accgcatcag gcggccttaa gggcctcgtg atacgcctat ttttataggt taatgtcatg 2160 

ataataatgg tttcttagac gtcaggtggc acttttcggg gaaatgtgcg cggaacccct 2220 

atttgtttat ttttctaaat acattcaaat atgtatccgc tcatgagaca ataaccctga 2280 

taaatgcttc aataatattg aaaaaggaag agtatgagta ttcaacattt ccgtgtcgcc 2340 

cttattccct tttttgcggc attttgcctt cctgtttttg ctcacccaga aacgctggtg 2400 

aaagtaaaag atgctgaaga tcagttgggt gcacgagtgg gttacatcga actggatctc 2460

aacagcggta agatccttga gagttttcgc cccgaagaac gttttccaat gatgagcact 2520 

tttaaagttc tgctatgtgg cgcggtatta tcccgtattg acgccgggca agagcaactc 2580 

ggtcgccgca tacactattc tcagaatgac ttggttgagt actcaccagt cacagaaaag 2640 

catcttacgg atggcatgac agtaagagaa ttatgcagtg ctgccataac catgagtgat 2700 

aacactgcgg ccaacttact tctgacaacg atcggaggac cgaaggagct aaccgctttt 2760 

ttgcacaaca tgggggatca tgtaactcgc cttgatcgtt gggaaccgga gctgaatgaa 2820 

gccataccaa acgacgagcg tgacaccacg atgcctgtag caatggcaac aacgttgcgc 2880 

aaactattaa ctggcgaact acttactcta gcttcccggc aacaattaat agactggatg 2940 

gaggcggata aagttgcagg accacttctg cgctcggccc ttccggctgg ctggtttatt 3000 

gctgataaat ctggagccgg tgagcgtggg tctcgcggta tcattgcagc actggggcca 3060 

gatggtaagc cctcccgtat cgtagttatc tacacgacgg ggagtcaggc aactatggat 3120 

gaacgaaata gacagatcgc tgagataggt gcctcactga ttaagcattg gtaactgtca 3180 

gaccaagttt actcatatat actttagatt gatttaaaac ttcattttta atttaaaagg 3240 

atctaggtga agatcctttt tgataatctc atgaccaaaa tcccttaacg tgagttttcg 3300 

ttccactgag cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga tccttttttt 3360 

ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg 3420 

ccggatcaag agctaccaac tctttttccg aaggtaactg gcttcagcag agcgcagata 3480 

ccaaatactg tccttctagt gtagccgtag ttaggccacc acttcaagaa ctctgtagca 3540 

ccgcctacat acctcgctct gctaatcctg ttaccagtgg ctgctgccag tggcgataag 3600 

tcgtgtctta ccgggttgga ctcaagacga tagttaccgg ataaggcgca gcggtcgggc 3660 

tgaacggggg gttcgtgcac acagcccagc ttggagcgaa cgacctacac cgaactgaga 3720 

tacctacagc gtgagcattg agaaagcgcc acgcttcccg aagggagaaa ggcggacagg 3780 

tatccggtaa gcggcagggt cggaacagga gagcgcacga gggagcttcc agggggaaac 3840 
gcctggtatc tttatagtcc tgtcgggttt cgccacctct gacttgagcg tcgatttttg 3900

tgatgctcgt caggggggcg gagcctatgg aaaaacgcca gcaacgcggc ctttttacgg 3960 

ttcctggcct tttgctggcc ttttgctcac atgttctttc ctgcgttatc ccctgattct 4020 

gtggataacc gtattaccgc ctttgagtga gctgataccg ctcgccgcag ccgaacgacc 4080 

gagcgcagcg agtcagtgag cgaggaagcg gaagagcgcc caatacgcaa accgcctctc 4140 

cccgcgcgtt ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg 4200 

ggcagtgagc gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta 4260

cactttatgc ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca 4320 

ggaaacagct atgaccatga ttacgcca 4348

<210> 5 
<211> 9309 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: plasmid pGN148
<400> 5 

atgactgctc caaagaagaa gcgtaaggta ccggtaatga acacgattaa catcgctaag 60 

aacgacttct ctgacatcga actggctgct atcccgttca acactctggc tgaccattac 120 

ggtgagcgtt tagctcggta agtttaaaca tctagatact aactaacgat taacatttaa 180 

attttcagcg aacagttggc ccttgagcat gagtcttacg agatgggtga agcacgcttc 240 

cgcaagatgt ttgagcgtca acttaaagct ggtgaggttg cggataacgc tgccgccaag 300 

cctctcatca ctaccctact ccctaagatg attgcacgca tcaacgactg gtttgaggaa 360 

gtgaaagcta agcgcggcaa gcgcccgaca gccttccagt tcctgcaaga aatcaagccg 420 

gaagccgtag cgtacatcac cattaagacc actctggctt gcctaaccag tgctgacaat 480 

acaaccgttc aggctgtagc aagcgcaatc ggtcgggcca ttgaggacga ggctcgcttc 540 

ggtcgtatcc gtgaccttga agctaagcac ttcaagaaaa acgttgagga acaactcaac 600 

aagcgcgtag ggcacgtcta caagaaagca tttatgcaag ttgtcgaggc tgacatgctc 660 

tctaagggtc tactcggtgg cgaggcgtgg tcttcgtggc ataaggaaga ctctattcat 720 

gtaggagtac gctgcatcga gatgctcatt gagtcaaccg gagtggttag cttacaccgc 780 

caaaatgctg gcgtagtagg tcaagactct gagactatcg aactcgcacc tgaatacgct 840 

gaggctatcg caacccgtgc aggtgcgctg gctggcatct ctccgatgtt ccaaccttgc 900 

gtagttcctc ctaagccgtg gactggcatt actggtggtg gctattgggc taacggtcgt 960 

cgtcctctgg cgctggtgcg tactcacagt aagaaagcac tgatgcgcta cgaagacgtt 1020

tacatgcctg aggtgtacaa agcgattaac attgcgcaaa acaccgcatg gaaaatcaac 1080 

aagaaagtcc tagcggtcgc caacgtaatc accaagtgga agcattgtcc ggtcgaggac 1140

atccctgcga ttgagcgtga agaactcccg atgaaaccgg aagacatcga catgaatcct 1200 

gaggctctca ccgcgtggaa acgtgctgcc gctgctgtgt accgcaagga caaggctcgc 1260 

aagtctcgcc gtatcagcct tgagttcatg cttgagcaag ccaataagtt tgctaaccat 1320 

aaggccatct ggttccctta caacatggac tggcgcggtc gtgtttacgc tgtgtcaatg 1380 

ttcaacccgc aagctaacga tatgaccaaa ggactgctta cgctggcgaa aggtaaacca 1440

atcggtaagg aaggttacta ctggctgaaa atccacggtg caaactgtgc gggtgtcgat 1500 

aaggttccgt tccctgagcg catcaagttc attgaggaaa accacgagaa catcatggct 1560 

tgcgctaagt ctccactgga gaacacttgg tgggctgagc aagattctcc gttctgcttc 1620 

cttgcgttct gctttgagta cgctggggta cagcaccacg gcctgagcta taactgctcc 1680 

cttccgctgg cgtttgacgg gtcttgctct ggcatccagc acttctccgc gatgctccga 1740 

gatgaggtag gtggtcgcgc ggttgtaagt ttaaactcta tcctactaac taacgaagct 1800 

tatttaaatt ttcagaactt gcttcctagt gaaaccgttc aggacatcta cgggattgtt 1860

gctaagaaag tcaacgagat tctacaagca gacgcaatca atgggaccga taacgaagta 1920 

gttaccgtga ccgatgagaa cactggtgaa atctctgaga aagtcaagct gggcactaag 1980 

gcactggctg gtcaatggct ggcttacggt gttactcgca gtgtgactaa gcgttcagtc 2040 

atgacgctgg cttacgggtc caaagagttc ggcttccgtc aacaagtgct ggaagatacc 2100 

attcagccag ctattgattc cggcaagggt ctgatgttca ctcagccgaa tcaggctgct 2160 

ggatacatgg ctaagctgat ttgggaatct gtgagcgtga cggtggtagc tgcggttgaa 2220 

gcaatgaact ggcttaagtc tgctgctaag ctgctggctg ctgaggtcaa agataagaag 2280 

actggagaga ttcttcgcaa gcgttgcgct gtgcattggg tcactccgga tggtttccct 2340 

gtgtggcagg aatacaagaa gcctattcaa acgcgtttga acctgatgtt cctcggtcag 2400 

ttccgcttac agcctaccat taacaccaac aaagatagcg agattgatgc acacaaacag 2460

gagtctggta tcgctcctaa ctttgtacac agccaagacg gtagccacct tcgtaagact 2520 




gtagtgtggg cacacgagaa gtacggaatc gaatcttttg cactgattca cgactccttc 2580 
ggtaccattc cggctgacgc tgcgaacctg ttcaaagcag tgcgcgaaac tatggttgac 2640 
acatatgagt cttgtgatgt actggctgat ttctacgacc agttcgctga ccagttgcac 2700 
gagtctcaat tggacaaaat gccagcactt ccggctaaag gtaacttgaa cctccgtgac 2760 
atcttagagt cggacttcgc gttcgcgtaa gaattccaac tgagcgccgg tcgctaccat 2820 
taccaacttg tctggtgtca aaaataatag gggccgctgt catcagagta agtttaaact 2880 
gagttctact aactaacgag taatatttaa attttcagca tctcgcgccc gtgcctctga 2940 
cttctaagtc caattactct tcaacatccc tacatgctct ttctccctgt gctcccaccc 3000 
cctatttttg ttattatcaa aaaaacttct tcttaatttc tttgtttttt agcttctttt 3060 
aagtcacctc taacaatgaa attgtgtaga ttcaaaaata gaattaattc gtaataaaaa 3120 
gtcgaaaaaa attgtgctcc ctccccccat taataataat tctatcccaa aatctacaca 3180 
atgttctgtg tacacttctt atgttttttt tacttctgat aaattttttt tgaaacatca 3240 
tagaaaaaac cgcacacaaa ataccttatc atatgttacg tttcagttta tgaccgcaat 3300 
ttttatttct tcgcacgtct gggcctctca tgacgtcaaa tcatgctcat cgtgaaaaag 3360 
ttttggagta tttttggaat ttttcaatca agtgaaagtt tatgaaatta attttcctgc 3420 
ttttgctttt tgggggtttc ccctattgtt tgtcaagagt ttcgaggacg gcgtttttct 3480
tgctaaaatc acaagtattg atgagcacga tgcaagaaag atcggaagaa ggtttgggtt 3540 
tgaggctcag tggaaggtga gtagaagttg ataatttgaa agtggagtag tgtctatggg 3600 
gtttttgcct taaatgacag aatacattcc caatatacca aacataactg tttcctacta 3660 
gtcggccgta cgggcccttt cgtctcgcgc gtttcggtga tgacggtgaa aacctctgac 3720 
acatgcagct cccggagacg gtcacagctt gtctgtaagc ggatgccggg agcagacaag 3780 
cccgtcaggg cgcgtcagcg ggtgttggcg ggtgtcgggg ctggcttaac tatgcggcat 3840 
cagagcagat tgtactgaga gtgcaccata tgcggtgtga aataccgcac agatgcgtaa 3900 
ggagaaaata ccgcatcagg cggccttaag ggcctcgtga tacgcctatt tttataggtt 3960 
aatgtcatga taataatggt ttcttagacg tcaggtggca cttttcgggg aaatgtgcgc 4020 
ggaaccccta tttgtttatt tttctaaata cattcaaata tgtatccgct catgagacaa 4080 
taaccctgat aaatgcttca ataatattga aaaaggaaga gtatgagtat tcaacatttc 4140 
cgtgtcgccc ttattccctt ttttgcggca ttttgccttc ctgtttttgc tcacccagaa 4200 
acgctggtga aagtaaaaga tgctgaagat cagttgggtg cacgagtggg ttacatcgaa 4260 
ctggatctca acagcggtaa gatccttgag agttttcgcc ccgaagaacg ttttccaatg 4320 
atgagcactt ttaaagttct gctatgtggc gcggtattat cccgtattga cgccgggcaa 4380 
gagcaactcg gtcgccgcat acactattct cagaatgact tggttgagta ctcaccagtc 4440
acagaaaagc atcttacgga tggcatgaca gtaagagaat tatgcagtgc tgccataacc 4500 
atgagtgata acactgcggc caacttactt ctgacaacga tcggaggacc gaaggagcta 4560 
accgcttttt tgcacaacat gggggatcat gtaactcgcc ttgatcgttg ggaaccggag 4620
ctgaatgaag ccataccaaa cgacgagcgt gacaccacga tgcctgtagc aatggcaaca 4680
acgttgcgca aactattaac tggcgaacta cttactctag cttcccggca acaattaata 4740 
gactggatgg aggcggataa agttgcagga ccacttctgc gctcggccct tccggctggc 4800 
tggtttattg ctgataaatc tggagccggt gagcgtgggt ctcgcggtat cattgcagca 4860 
ctggggccag atggtaagcc ctcccgtatc gtagttatct acacgacggg gagtcaggca 4920 
actatggatg aacgaaatag acagatcgct gagataggtg cctcactgat taagcattgg 4980 
taactgtcag accaagttta ctcatatata ctttagattg atttaaaact tcatttttaa 5040 
tttaaaagga tctaggtgaa gatccttttt gataatctca tgaccaaaat cccttaacgt 5100 
gagttttcgt tccactgagc gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat 5160 
cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct accagcggtg 5220 
gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg cttcagcaga 5280 
gcgcagatac caaatactgt ccttctagtg tagccgtagt taggccacca cttcaagaac 5340 
tctgtagcac cgcctacata cctcgctctg ctaatcctgt taccagtggc tgctgccagt 5400 
ggcgataagt cgtgtcttac cgggttggac tcaagacgat agttaccgga taaggcgcag 5460
cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac gacctacacc 5520 
gaactgagat acctacagcg tgagcattga gaaagcgcca cgcttcccga agggagaaag 5580 
gcggacaggt atccggtaag cggcagggtc ggaacaggag agcgcacgag ggagcttcca 5640 
gggggaaacg cctggtatct ttatagtcct gtcgggtttc gccacctctg acttgagcgt 5700 
cgatttttgt gatgctcgtc aggggggcgg agcctatgga aaaacgccag caacgcggcc 5760 
tttttacggt tcctggcctt ttgctggcct tttgctcaca tgttctttcc tgcgttatcc 5820 
cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc tcgccgcagc 5880 
cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc aatacgcaaa 5940 
ccgcctctcc ccgcgcgttg gccgattcat taatgcagct ggcacgacag gtttcccgac 6000 
tggaaagcgg gcagtgagcg caacgcaatt aatgtgagtt agctcactca ttaggcaccc 6060 
caggctttac actttatgct tccggctcgt atgttgtgtg gaattgtgag cggataacaa 6120 
tttcacacag gaaacagcta tgaccatgat tacgccaagc tgtaagttta aacatgatct 6180 
tactaactaa ctattctcat ttaaattttc agagcttaaa aatggctgaa atcactcaca 6240
acgatggata cgctaacaac ttggaaatga aataagcttg catgcctgca gagcaaaaaa 6300
atactgcttt tccttgcaaa attcggtgct ttcttcaaag agaaactttt gaagtcggcg 6360
cgagcatttc cttctttgac ttctctcttt ccgccaaaaa gcctagcatt tttattgata 6420
atttgattac acacactcag agttcttcga catgataaag tgtttcattg gcactcgccc 6480
taacagtaca tgacaagggc ggattattat cgatcgatat tgaagacaaa ctccaaatgt 6540
gtgctcattt tggagccccg tgtggggcag ctgctctcaa tatattacta gggagacgag 6600
gagggggacc ttatcgaacg tcgcatgagc cattctttcfc tctttatgca ctctcttcac 6660
tctctcacac attaatcgat tcatagactc ccatattcct tgatgaaggt gtgggttttt 6720
agcttttttt cccgatttgt aaaaggaaga ggctgacgat gttaggaaaa agagaacgga 6780
gccgaaaaaa catccgtagt aagtcttcct tttaagccga cactttttag acagcattcg 6840
ccgctagttt tgaagtttaa attttaaaaa ataaaaatta gtttcaattt tttttaatta 6900
ctaaataggc aaaagttttt tcaagaactc tagaaaaact agcttaattc atgggtacta 6960
gaaaaattct tgttttaaat ttaatattta tcttaagatg taattacgag aagctttttt 7020
gaaaattctc aattaaaaga atttgccgat ttagaataaa agtcttcaga aatgagtaaa 7080
agctcaaatt agaagtttgt ttttaaagga aaaacacgaa aaaagaacac tatttatctt 7140
ttcctccccg cgtaaaatta gttgttgtga taatagtgat ccgctgtcta tttgcactcg 7200
gctcttcaca ccgtgcttcc tctcacttga cccaacagga aaaaaaaaca tcacgtctga 7260
gacggtgaat tgccttatca agagcgtcgt ctctttcacc cagtaacaaa aaaaatttgg 7320
tttctttact ttatatttat gtaggtcaca aaaaaaaagt gatgcagttt tgtgggtcgg 7380
ttgtctccac accacctccg cctccagcag cacacaatca tcttcgtgtg ttctcgacga 7440
ttccttgtat gccgcggtcg tgaatgcacc acattcgacg cgcaactaca caccacactc 7500
actttcggtg gtattactac acgtcatcgt tgttcgtagt ctcccgctct ttcgtcccca 7560
ctcactcctc attattcccc ttggtgtatt gatttttttt aaatggtaca ccactcctga 7620
cgtttctacc ttcttgtttt ccgtccattt agattttatc tggaaatttt tttaaaattt 7680
taggccagag agttctagtt cttgttctaa aagtctaggt cagacataca ttttctattt 7740
ctcatcaaaa aaaaagttga taaagaaaac tggttattca gaaagagtgt gtctcgttga 7800
aattgattca aaaaaaaatt cccacccctc gcttgtttct caaaatatga gatcaacgga 7860
ttttttcctt ctcgattcaa ttttttgctg cgctctgtct gccaaagtgt gtgtgtccga 7920
gcaaaagatg agagaattta caaacagaaa tgaaaaaaag ttggccaaat aatgaagttt 7980
tatccgagat tgatgggaaa gatattaatg ttctttacgg tttggagggg agagagagat 8040
agattttcgc atcaaactcc gccttttaca tgtcttttag aatctaaaat agatttttct 8100
catcattttt aatagaaaat cgagaaatta cagtaatttc gcaattttct tgccaaaaat 8160
acacgaaatt tgtgggtctc gccacgatct cggtcttagt ggttcatttg gtttaaaagt 8220
ttataaaatt tcaaattcta gtgtttaatt tccgcataat tggacctaaa atgggttttt 8280
gtcatcattt tcaacaagaa atcgtgaaaa tcctgttgtt tcgcaatttt cttttcaaaa 8340
atacacgaaa tatatggtaa tttcccgaaa tattgagggt ctcgccacga tttcagtcac 8400
agtggccagg atttatcacg aaaaaagttc gcctagtctc acatttccgg aaaaccgaat 8460
ctaaattagt tttttgtcat cattttgaac aaaaaatcga gacatcccta tagtttcgca 8520
attttcgtcg cttttctctc caaaaatgac agtctagaat taaaattcgc tggaactggg 8580
accatgatat cttttctccc cgtttttcat tttatttttt attacactgg attgactaaa 8640
ggtcaccacc accgccagtg tgtgccatat cacacacaca cacacacaca atgtcgagat 8700
tttatgtgtt atccctgctt gatttcgttc cgttgtctct ctctctctat tcatcttttg 8760
agccgagaag ctccagagaa tggagcacac aggatcccgg cgcgcgatgt cgtcgggaga 8820
tggcgccgcc tgggaagccg ccgagagata tcagggaaga tcgtctgatt tctcctcgga 8880
tgccacctca tctctcgagt ttctccgcct gttactccct gccgaacctg atatttcccg 8940
ttgtcgtaaa gagatgtttt tattttactt tacaccgggt cctctctctc tgccagcaca 9000
gctcagtgtt ggctgtgtgc tcgggctcct gccaccggcg gcctcatctt cttcttcttc 9060
ttctctcctg ctctcgctta tcacttcttc attcattctt attccttttc atcatcaaac 9120
tagcatttct tactttattt atttttttca attttcaatt ttcagataaa accaaactac 9180
ttgggttaca gccgtcaaca gatccccggg attggccaaa ggacccaaag gtatgtttcg 9240
aatgatacta acataacata gaacattttc aggaggaccc ttgcttggag ggtaccgagc 9300
tcagaaaaa 9309